Modularization of library #746

Nalinc · 2016-05-02T13:21:55Z

There is a rising demand to move towards modularization of *marked * and separate different components in different files(grammar-rules, lexers, parsers and renderers). This though not have been mentioned explicitly, is evident from issues #743, #717, #704.

A properly modularized code would make maintenance easy and also allow addition of new grammar, implementation of custom parser, renderer and would address separation of concerns properly.

Like may be we can have an /src folder to fiddle with the code in development and use /lib for deliverable file.

The text was updated successfully, but these errors were encountered:

joshbruce · 2017-12-03T17:37:56Z

@Nalinc: See #956

This idea is intriguing to me, primarily from the perspective of making it easier for the library to communicate what and how it works from an architectural perspective.

I don't know enough about Marked to do this work myself. What would you propose as a full folder structure given what's already here?

Nalinc · 2017-12-04T07:57:14Z

hey @joshbruce

I worked on marked a bit in past and even changed some of its functionalities based on my needs back then. One thing I found annoying was keeping track of things in the same file even if I was working on a different component. Here's what I did back then:

I moved each of the primary components in a separate file in /src. This way I was left with core.js, grammar.js, lexer.js, parser.js and renderer.js in /src folder.
Each of these components was written in IIFEs so they weren't messing with global scope.
Finally, I had a grunt task that'd combine these components from /src and create a bundled file in /dist`

I found it easy because having independent components in different files, I could easily change/extend one component without worrying about others. The code became smaller, readable and more manageable too. I feel comfortable with marked and am open to make a PR on this.
Regarding #956, is there an exiting card for this or should I make one and move it to "under consideration"?

joshbruce · 2017-12-23T20:20:10Z

#420

#660

joshbruce · 2018-01-05T23:59:06Z

@Nalinc: Sorry it's taken a moment - I totally didn't see (or catch) your comments and questions at the bottom.

That architecture sounds about right to me; given what I've seen in the issues and PRs. Let me tag in a few of the other recently active community members to weigh in as well (@UziTech, @Feder1co5oave, @KostyaTretyak). There's not an existing card beyond this one; so, congrats on that. :)

Regarding submitting a PR, this is where I would like more feedback from the community. Our focus for the 0.4.0 milestone is to try and get functional issues taken care of. We have this card marked for the 0.5.0 milestone...but, if we can do it without much negative impact to cleaning up the other issues - I think it could actually help make things a bit more obvious on what's causing some of the other issues. I also know @UziTech is trying to make some improvements to all things testing, which will help the issue house cleaning and worthwhile for the 0.4.0 milestone as well.

joshbruce · 2018-01-28T19:42:04Z

Been thinking about this a lot lately - blaming the day job as I'm a lead architect and trying to build up my chops in the software architecture practice. Mainly thinking about the open/closed principle and a way to make marked easy to extend while maintaining speed.

Would like to know the purpose of each of these, as I can't really discern them from looking at the code:

core.js - ??
grammar.js - ??
lexer.js - ??
parser.js - ??
renderer.js - ??

Most of the flavors of Markdown I've seen don't really go too far afield from the original Markdown specification - what I believe we call "pedantic". Looking at this as an object-oriented problem to solve, we might end up with something like the following as a class inheritance tree (based on specification):

Pedantic (Markdown.pl)
- CommonMark
  - GFM (from what I understand from comments made by @Feder1co5oave - GFM is more like an extension of the CommonMark spec)

So, if we make "Pedantic" the default - it can be its own class. Then CommonMark can extend Pedantic - overriding what needs overriding and adding anew what needs adding. The GitHubFlavored can extend CommonMark...and so on. At which point, adding in new flavors should be relatively easy (famous last words). CriticMarkup, for example, would extend Markdown.

Of course, this introduces an interesting concept - combining specs in an ad-hoc fashion. So, instead of using inheritance to extend a base class (because Inheritance is the Base Class of Evil) we could have renderers for each of the specs - each with only the parts that are different from the others. So, using GFM would include the stuff from CommonMark and Pedantic - without inheritance - it would be through dependency or some other type of inclusion.

But, again, I'm not sure what the purpose of each of the recommended JS files is. What is the difference between a parser and a renderer and a lexer??

Also, what about allowing consumers to skip certain parsing steps (a text editor that says, "Don't allow headers," which would no longer need to go through the header check and parse steps)?? What about changing the order of parsing, does that affect performance or output at all??

joshbruce · 2018-02-10T02:57:19Z

To ES6 or not to ES6 is one question on the table.
I also learned more about JS recently - not in my usual language stack or selection (I know, weird, but I'm learning - lol). If we went totally prototypal based on specification - that could also be interesting totally making sh*t up here to demonstrate a non-ES6 concept (no fake classes):

Daring = {}

Daring.prototype.paragraph = function(markdown) {
  // whatever this is against the param :)
  // /^([^\n]+(?:\n?(?!hr|heading|lheading| {0,3}>|tag)[^\n]+)+)/
  return result;
}

CommonMark = {}

CommonMark.prototype.paragraph = function(markdown) {
  return Daring.paragraph(markdown);
}

GFM = {}

GFM.prototype.paragraph = function(markdown) {
  return Daring.paragraph(markdown);
}

(This could be horrible JS, btw)

I should be able to override all the things at that point, yeah? (Hyper-extensibility.) We capture the exact differences between each of the specifications. I mean, unless I'm just a complete idiot, which is possible - this move would resolve most of the "feature request" problems that got us into hot water in the first place.

Having said that, I have no idea how slow it would be - but Marked isn't the fastest kid on the block anymore anyway.

We could also bail on the the configuration options for choosing which flavor, because you're always doing it directly - then divide the architecture by flavor to minimize compilation and package size - for example, Daring would be the lightest weight solution for client-side...but have the least amount of extended features. GFM would require CommonMark - override and add where need be.

Add builds for minified versions of each flavor to the npm build script @Feder1co5oave just finished.

Granted I still don't know what the difference is between the lexer, the parser, and all that jazz. But, I mean, if we're going to be doing breaking changes anyway, might as well reconsider our lives a bit on the way. lol

Again, just spitballing from the sidelines based on things I've done in other languages to solve similar problems.

Tagging a gentleman by the name of @colinalford - He's been helping me with my JS, is wicked smart, and should probably not add this to the list of things he's already doing. rofl

joshbruce · 2018-02-10T03:00:48Z

We can also do it in such a way that the changes aren't breaking for those who haven't modified or extend it - I think. Leave the same entry API as the entry point for the facade to do everything else.

UziTech · 2018-02-10T05:05:19Z

The question is not really "To ES6 or not to ES6" because I sure don't want to stick with ES5 forever. It's more of a question of when to ES6.

ES6+ has lots of improvements for code readability and modularization. We will want to keep an eye on the benchmarks though.

P.S. ES2018 will come with named capture groups which would make the RegExp heavy code much more readable.

joshbruce · 2018-02-10T05:09:27Z

@UziTech: How so? Can you drop a small code sample? Pseudo is good enough as long as it does es6 things. :)

UziTech · 2018-02-10T05:10:16Z

I think before we start modularizing the code base we need to do some serious work on automating testing and benchmarking in node and a browser environment like chrome headless or phantomjs

joshbruce · 2018-02-10T05:12:56Z

That's fair. Before that (or during it), we should probably keep working on the fixing known issues thing as well - this is just future visioning.

What do you propose we need? What stops us from getting any of them? (CI against PRs, for example.) Maybe a ticket on this? I don't think we have one yet.

joshbruce · 2018-02-10T05:13:21Z

I'll flag that one as part of 0.4.0 milestone to make sure we don't go out of order.

UziTech · 2018-02-10T05:17:44Z

@joshbruce Here is the same code with ES6 classes as your example above:

class Daring {
  paragraph(markdown) {
    // whatever this is against the param :)
    // /^([^\n]+(?:\n?(?!hr|heading|lheading| {0,3}>|tag)[^\n]+)+)/
    return result;
  }
}

class CommonMark extends Daring {}

class GFM extends Daring {}

Much less boilerplate code

UziTech · 2018-02-10T05:28:39Z

As far as automated testing is concerned, it would be nice if @chjj could set up CI for PRs on Travis for mac and linux and AppVeyor for windows. Both are free for open source projects.

I could create the CI scripts to automatically run the tests on multiple versions of node and different browser engines including benchmarks.

That would help a lot when modularizing the code base to prevent accidentally introducing bugs or performance hits.

joshbruce · 2018-02-10T05:42:28Z

Yep. This is where @colinalford really comes into play for this conversation because he knows more about it than me but the class system in JS under the hood isn't true classing is it? Not like the ones we find in Swift, PHP, Objective-C?

I mean every object in JS is not only a definition (a class) but an instance as well; so, by boilerplate are we talking about the "prototype" piece? And functions are first class citizens, yeah?

Inheritance of classes can get tricky from the perspective that I can't just inherit parts and pieces (https://youtu.be/bIhUE5uUFOA). And, if this is what makes it easier to modularize, I'm not sure we need/want ES6 for most of it.

This is also what makes the prototype system interesting...I can inherit just the things I want - the banana without the gorilla holding it. Similar to traits in PHP, for example.

Agreed on the automated CI piece.

Is that it; or, is there more we could/should be doing on that score?

(I don't know how we can do the CI thing and actually maintain the repo being under @chjj . I mean it took us a lot of work to get to having me able to publish things, thanks for that. Does GitHub have a "I would like to take over this thing" like NPM does?)

UziTech · 2018-02-10T07:18:46Z

Right, the class syntax is just hiding all the boilerplate "prototype" pieces. The code is doing the same thing just much easier to read.

We can just copy parts if we need to:

class CommonMark {
  // Daring has many other things but
  // we only want the paragraph
  paragraph(markdown) {
    return Daring.paragraph(markdown);
  }
}

again same as the ES5 code, just cleaner.

ES6 has many benefits over ES5 especially engine optimizations but since we will be transpiling it to ES5 anyways developer sanity will be the only benefit we will get until we can drop older browser/node versions.

That being said, keeping up with current ES versions will help us get those optimizations when we can drop the old version since we won't have to do an entire rewrite. All we will have to do is change our transpiler configuration.

UziTech · 2018-02-10T07:22:03Z

Once the class fields proposal lands in ES it will be even easier:

class CommonMark {
  // Daring has many other things but
  // we only want the paragraph
  paragraph = Daring.paragraph;
}

joshbruce · 2018-02-11T13:15:19Z

@UziTech: Fair. And I agree. Now the next question, not a recommendation, just want to cover the base:

What about TypeScript?

It adds keywords that do not exist in the ES spec as of today, but might in the future; could the same argument be made and does it hold the same value?

colinalford · 2018-02-11T14:42:43Z

This is a great Crockford video, and he advocates what he calls a class free paradigm starting 42:50: https://youtu.be/DePE0ffiMf4

No need for classes, no need to worry about inheritance. Write a constructor, add the parts you want from other classes, and return the new object. Don't have to wait for the new spec, either. Very cool.

I am not the biggest fan of classes in JavaScript since it's a false construct. I write Typescript everyday for an Angular project, though, and I don't think it really matters given you understand the underlying JavaScript.

Re: using Typescript, I really like it for the type system. It has been incredibly helpful organizing my code. However, it's kind of a pain to setup and adds a bunch of complexity with the build. I don't think I would advocate adding it for such a small project -- ~1300 LOC seems manageable without a type system.

UziTech · 2018-02-11T19:57:28Z

@KostyaTretyak has done a pretty good job of rewriting marked using TypeScript in marked-ts

I'm not opposed to TypeScript but I feel like that would be an extra barrier to getting contributors. Anyone who would want to submit a pull request would need to learn TypeScript.

The class-free method seems clever, which is usually not a good thing in the long run. Almost all code smells were once clever hacks.

My vote is for sticking with modern standards. It seems like the easiest way to not fall behind in the future.

colinalford · 2018-02-11T21:44:13Z

@UziTech I hear your concerns about being clever, but it is generally accepted in OOP that we should prefer composition over inheritance. The class free approach means using only composition while ignoring inheritance.

There has been a lot of debate in the community since the introduction of classes about whether we should even use them or not. I'm not sure I would say its a modern standard just because its part of the modern spec.

But that's JS in a nutshell. Its not like we ever had a PEP like Python or whatever the PHP style guide is. Probably Crockford's Good Parts is the closest thing in the past, and AirBNB's style guide is pretty popular nowadays, so maybe pick that (they prefer classes to prototypes, btw).

IMO as long as everyone understands how Javascript really works and that classes don't actually exist in JS, then it is a nice syntax to organize your code. My fears are based around Java/C# devs working on ES6/Typescript projects for the first time without learning the differences between the languages.

Also, here's a really good article from Dan Abramov titled "How to Use Classes and Sleep At Night": https://medium.com/@dan_abramov/how-to-use-classes-and-sleep-at-night-9af8de78ccb4

colinalford · 2018-02-11T21:45:48Z

@joshbruce Actually, that's a good recommendation. Use AirBnb's Style Guide for the project.

joshbruce · 2018-02-11T22:20:24Z

@colinalford: See PHP-fig - PSRs 1 and 2 specifically - https://www.php-fig.org :)

Agreed on the AirBnB style guide. The USWDS also started using it. It definitely seems to be becoming a more accepted standard. Having said that, as I'm sure you already know about me, ultimately it's up to the people actually maintaining the majority of the codebase. ;)

(@Feder1co5oave and @UziTech, AirBnB is definitely worth looking into.)

I think that's my only issue using the class keyword in JS - it seems like a lie and hides the more functional-like compositional approach of JS. So, it's like JS is trying to be more class-like (PHP, Java, et al), while I'm trying to make PHP feel more functional...the world has gone topsy-turvy I tell you!

@UziTech: This is what I'm referring to on that score. https://github.com/8fold/php-html-component/blob/master/tests/ComponentTest.php

joshbruce · 2018-02-11T22:34:22Z

@colinalford: Having said that, we are talking linters:

#999 - #1020

UziTech · 2018-02-11T23:03:37Z

@colinalford sticking with modern specs is what I meant by modern standards. Specs aren't always the best way to do things but they are the way that most people know and the starting point of future specs. If we want to keep this project from becoming stale again we will want to keep up with the specs.

I'm not married to the javascript class either but I think this:

class Markdown {
  paragraph() {
    // code...
  }
}

is easier to read and write than:

function Markdown(){}

Markdown.prototype.paragraph = function () {
  // code...
}

If we want something to be a javascript "class" we might as well call it so.

The class syntax is not really why I want to use ES6+ that was just an easy example of making the code easier to read.

ES6+ also comes with:

Block scoped variables (i.e. let, const)
Default parameters
Destructuring
Template literals
Arrow functions
Async/Await
and many more...

Again, since we are compiling to ES5 in the build, the only benefit to any of this is ease of development.

Being that this is a community project with lots of people working on it (as opposed to a single person writing most of the code) remember that which ever code style we pick will need to be learned and written by anyone who submits a PR. Sticking with modern specs seems to be the easiest way to be sure that most people are able to easily help with the project.

joshbruce · 2018-02-11T23:21:37Z

@UziTech: Agreed on the easier to write; however, it seems a bit more awkward to cherry pick functionality - but I suppose, technically, they're static methods.

Also, agree with @colinalford, that as long as everyone understands what's going on under the hood re classes, it's not that big of a deal.

I do appreciate a lot of what ES6 offers. I also appreciate the readonly brought in by typescript (as the let and const are not the same as what we see in something like Swift).

I might have to call a bit on the "lots of people working on it" piece a bit. Most of the PRs submitted since taking over have been from 2 people. Most of the "contributions" have been in the form of issues - not fixes - by other people...this, in my opinion, is what made Marked go stale after reading the issues. People just asking for more and more stuff and people either letting it through (not acting like the hardcore filter @chjj used to be when he was still participating) and the other contributors and Chris seeming to walk away - it just became too much to sustain - but some folks didn't want to walk away when we put the call out - it's got potential.

To be honest, if you (specifically) and @Feder1co5oave hadn't showed up, I was actually going to start systematically shutting it down and probably closing the project altogether - after having been made an owner of the package.

The world doesn't need another dead NPM package to choose from based on its popularity by being one of the first. #956 - for future reference while working on other things.

UziTech · 2018-02-12T02:46:05Z

It will be much easier to cherry pick once private methods are introduced.

"lots" may have been an exaggeration but my point was that this is a project where the only way it will last very long is if it is easy to hand off to the next person that wants to maintain it. I don't think there would have been two or three people willing to take over this project from Chris if it was written in Elm or ClojureScript.

While I enjoy writing TypeScript, I will never use it for one of my projects unless the project is meant for the TypeScript community (like an angular tool).

joshbruce · 2018-02-26T02:52:08Z

Not entirely sure, but it seems like #1077 and the push for a faster move to the 1.0 is clearing scrutiny. It got me thinking about the transition period between 1.0 (marked with current APIs and whatnot) and 2.0 (all the architecture and deprecation changes ever) and backward compatibility while actually making the changes. We might be able to maintain marked.js as an entry point that then references other things based on the setup options.

marked.js
core.js - rules common across supported flavors
grammar.js - maybe not necessary at root
lexer.js - same
parser.js - same
renderer.js - same
/DaringFireball
  core.js, grammar.js, lexer.js, parser.js, renderer.js
/CommonMark
  core.js, grammar.js, lexer.js, parser.js, renderer.js
/GFM
  core.js, grammar.js, lexer.js, parser.js, renderer.js
/[some other flavor]
  core.js, grammar.js, lexer.js, parser.js, renderer.js

Then we could compile minified versions for each flavor to allow users to take only the functionality they need into their projects. We could also modify the way the option settings change things in marked.js itself.

pedantic: true - use DaringFireball indirectly
gfm: true - use GFM indirectly (if both are set to true, use non-pedantic)

// Create more generic setting
flavor: daring|gfm|common-mark|[some other flavor]

Doing the update to the docs has me seeing a lot of setting options I'm curious if we need or want as well. gfm: true, tables: false, for example, why not use CommonMark instead?

joshbruce · 2018-09-29T00:08:01Z

#1347

styfle · 2018-09-29T17:15:03Z

@joshbruce @UziTech This ticket along with #1288 and #1347 seems to be requesting the same thing. What do you think of closing the duplicates?

I would imagine we still want to focus on spec compliance before tearing things apart to modularize, and probably wait for ESM in Node.js to go stable.

Since this ticket appears to be the oldest, should we close the other two and get all this discussion in one place?

joshbruce · 2018-09-29T17:26:13Z

I'm down as long as the overarching thread stays connected.

Definitely sticking to spec compliance for now: https://github.com/markedjs/marked/milestones - doesn't really change until 1.x or 2.x.

I'm not sure we should close #1288 and #1347 to be honest. I'd almost want to keep the newest open, to be honest.

saragarmee · 2018-10-04T00:26:56Z

I'm fine with closing #1347
I'm the one that opened it. I've completed the refactor and it seems to be working, feel free to pull whatever you like from my fork if it's helpful in any way. I did have some questions at the end of my admittedly lengthy work through that I really would love to see answered if possible though.

styfle · 2018-10-04T15:26:00Z

Closing this ticket as a duplicate of #1288

joshbruce added help wanted L2 - annoying Similar to L1 - broken but there is a known workaround available for the issue question labels Dec 3, 2017

joshbruce added this to the 0.5.0 - Architecture and extensibility milestone Dec 25, 2017

joshbruce mentioned this issue Jan 5, 2018

Added modular plugin system #846

Closed

joshbruce added proposal and removed help wanted question labels Jan 5, 2018

This was referenced Jan 7, 2018

Refactor the escape() function to improve performance 10-20% #975

Merged

Fix tests #1002

Merged

sort tests #1004

Merged

This was referenced Jan 21, 2018

Renderer HTML escaping #636

Closed

Pass header flag to Renderer.tablerow method #818

Closed

[Emphasis/strong] Consolidate issues #1036

Closed

Feder1co5oave mentioned this issue Feb 10, 2018

Use uglifyjs CLI instead of Gulp to minify #1046

Merged

This was referenced Feb 28, 2018

Coding standards #1089

Closed

[WIP]: Update contributing #1096

Closed

Markdown flavors #1097

Closed

🚧[Experimenting]: Refactoring marked js🚧 #1102

Closed

joshbruce removed this from the 0.5.0 - Architecture and extensibility milestone Apr 4, 2018

joshbruce mentioned this issue Apr 5, 2018

Fix: TextRenerer.paragraph is not a function #1200

Closed

6 tasks

joshbruce mentioned this issue May 1, 2018

Gfm tables #1245

Merged

6 tasks

styfle mentioned this issue Jun 19, 2018

Question: Why embed marked instead of including as a dependency? marktext/marktext#141

Closed

styfle mentioned this issue Jun 29, 2018

fix issues link references and prototypes #1299

Merged

6 tasks

43081j mentioned this issue Oct 2, 2018

move to ESM and what not #1349

Closed

styfle closed this as completed Oct 4, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Modularization of library #746

Modularization of library #746

Nalinc commented May 2, 2016

joshbruce commented Dec 3, 2017

Nalinc commented Dec 4, 2017

joshbruce commented Dec 23, 2017

joshbruce commented Jan 5, 2018

joshbruce commented Jan 28, 2018 •

edited

joshbruce commented Feb 10, 2018 •

edited

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018 •

edited

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018

joshbruce commented Feb 10, 2018

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018 •

edited

UziTech commented Feb 10, 2018 •

edited

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018

UziTech commented Feb 10, 2018 •

edited

joshbruce commented Feb 11, 2018 •

edited

colinalford commented Feb 11, 2018

UziTech commented Feb 11, 2018 •

edited

colinalford commented Feb 11, 2018

colinalford commented Feb 11, 2018

joshbruce commented Feb 11, 2018

joshbruce commented Feb 11, 2018

UziTech commented Feb 11, 2018 •

edited

joshbruce commented Feb 11, 2018

UziTech commented Feb 12, 2018

joshbruce commented Feb 26, 2018

joshbruce commented Sep 29, 2018

styfle commented Sep 29, 2018

joshbruce commented Sep 29, 2018

saragarmee commented Oct 4, 2018

styfle commented Oct 4, 2018

Modularization of library #746

Modularization of library #746

Comments

Nalinc commented May 2, 2016

joshbruce commented Dec 3, 2017

Nalinc commented Dec 4, 2017

joshbruce commented Dec 23, 2017

joshbruce commented Jan 5, 2018

joshbruce commented Jan 28, 2018 • edited

joshbruce commented Feb 10, 2018 • edited

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018 • edited

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018

joshbruce commented Feb 10, 2018

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018 • edited

UziTech commented Feb 10, 2018 • edited

joshbruce commented Feb 10, 2018

UziTech commented Feb 10, 2018

UziTech commented Feb 10, 2018 • edited

joshbruce commented Feb 11, 2018 • edited

colinalford commented Feb 11, 2018

UziTech commented Feb 11, 2018 • edited

colinalford commented Feb 11, 2018

colinalford commented Feb 11, 2018

joshbruce commented Feb 11, 2018

joshbruce commented Feb 11, 2018

UziTech commented Feb 11, 2018 • edited

joshbruce commented Feb 11, 2018

UziTech commented Feb 12, 2018

joshbruce commented Feb 26, 2018

joshbruce commented Sep 29, 2018

styfle commented Sep 29, 2018

joshbruce commented Sep 29, 2018

saragarmee commented Oct 4, 2018

styfle commented Oct 4, 2018

joshbruce commented Jan 28, 2018 •

edited

joshbruce commented Feb 10, 2018 •

edited

UziTech commented Feb 10, 2018 •

edited

UziTech commented Feb 10, 2018 •

edited

UziTech commented Feb 10, 2018 •

edited

UziTech commented Feb 10, 2018 •

edited

joshbruce commented Feb 11, 2018 •

edited

UziTech commented Feb 11, 2018 •

edited

UziTech commented Feb 11, 2018 •

edited