Smartquotes #5
This will only apply to test fixtures when they get re-generated.
The test fixture is copied from markdown-it, with the Rust code being just a copy of the tables file with a modified path and plugin set. The core rule is still to be filled in; for now it is a no-op.
@rlidwka pointed out that this is how the JS implementation handles this. Confirmed by adding a unit test to the JS lib.
The `plus_minus` test works! The rest should be a walk in the park now. 😆
`cargo test -- --test fixtures_markdown_it_typographer_txt::ellipsis`
didn't check smartquotes.rs implementation yet, typographer implementation looks good (maybe merge it separately)
>
{
    fn run(root: &mut Node, _: &MarkdownIt) {
        let text_tokens = all_text_tokens(root);
JS has a pretest that returns early if the input has no quotes at all (to improve performance by skipping the heavy logic in most cases).
Maybe `all_text_tokens` can also check if quotes (`'` or `"`) are found, and return None/an empty vec as a shortcut.
Needs benchmarks though.
//!
//! The solution proposed here is to first compute all the replacement
//! operations on a read-only flat view of the document, and _then_ to perform
//! all replacements in a single call to `root.walk_mut`.
That solution is impressive. Also quite sad that it can't be done simpler.
Maybe the solution is to get rid of the AST, as the JS version doesn't have one. I tried to add an AST as an improvement, but Rust just isn't very good at tree manipulation (out of the box and without unsafe, that is).
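For readers following along, the two-phase idea from the quoted doc comment can be sketched on a toy tree. This is an illustration under stated assumptions, not the PR's code: the `Node` enum, `Replacement`, and all function names below are hypothetical, and the "rule" simply alternates opening and closing double quotes across the whole document.

```rust
/// Toy stand-in for the AST (hypothetical; not the crate's `Node`).
enum Node {
    Text(String),
    Container(Vec<Node>),
}

/// One planned edit: which text node (in document order), where, and what.
struct Replacement {
    text_index: usize,
    byte_offset: usize,
    new_char: char,
}

/// Phase 1a: build a read-only flat view of all text nodes.
fn flat_text<'a>(node: &'a Node, out: &mut Vec<&'a str>) {
    match node {
        Node::Text(s) => out.push(s.as_str()),
        Node::Container(children) => {
            for child in children {
                flat_text(child, out);
            }
        }
    }
}

/// Phase 1b: plan replacements; quote pairing may span node boundaries.
fn plan(texts: &[&str]) -> Vec<Replacement> {
    let mut ops = Vec::new();
    let mut open = true;
    for (text_index, text) in texts.iter().enumerate() {
        for (byte_offset, ch) in text.char_indices() {
            if ch == '"' {
                let new_char = if open { '\u{201C}' } else { '\u{201D}' };
                ops.push(Replacement { text_index, byte_offset, new_char });
                open = !open;
            }
        }
    }
    ops
}

/// Phase 2: a single mutable walk that applies the planned edits.
fn apply(node: &mut Node, ops: &[Replacement], next_index: &mut usize) {
    match node {
        Node::Text(s) => {
            let idx = *next_index;
            *next_index += 1;
            // Back to front, so earlier byte offsets stay valid.
            for op in ops.iter().rev().filter(|op| op.text_index == idx) {
                let mut buf = [0u8; 4];
                let new_str: &str = op.new_char.encode_utf8(&mut buf);
                s.replace_range(op.byte_offset..op.byte_offset + 1, new_str);
            }
        }
        Node::Container(children) => {
            for child in children {
                apply(child, ops, next_index);
            }
        }
    }
}

fn smarten(root: &mut Node) {
    let ops = {
        let mut texts = Vec::new();
        flat_text(root, &mut texts);
        plan(&texts)
    }; // the shared borrow of `root` ends here
    let mut next_index = 0;
    apply(root, &ops, &mut next_index);
}
```

The point of the split is that `plan` only needs shared references, so the borrow checker has nothing to object to until the final mutable walk.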
Indeed, I left the typographer PR #4 open because I figured it's kinda ready, while this one here definitely needs "some" cleaning. But this PR really depends on the typographer as well, so I based the smartquotes branch off of the typographer one. Thank you so much for the feedback! I'll go over it when I'm a little less tired. Some of it might also go into #4 then, we'll see. :)
This one turned out to be super difficult because we don't want to be making these replacements inside linkified URLs. Luckily, @rlidwka had the preceding commit ready to prevent that from happening. This fixes the last tests from the markdown-it fixture.
I had somehow missed that pattern when reading the docs on `once_cell` but now that I know about it, this is _the_ obvious way to do it.
Thanks to @rlidwka who provided this piece in an act of kind teaching. :)
42c0241 to de553e6
Integrated the first batch of feedback, some still to do. This still includes the commits from #4 but everything starting from "Generate smartquote test cases" (8947b04) belongs to this PR.
de553e6 to 37b29b5
This very closely follows the JS implementation
This way the main loop actually becomes quite small :)
Given that we don't use nested loops any longer, it is now easier to use a `for` loop.
These were just screaming to be rewritten a little. They're still not terribly intuitive, but at least concise.
The regex solution is concise, but has its drawbacks:
1. We need to call `to_string` to use a regex.
2. Using the whole regex machinery for single characters seems like overkill.
3. There was already a perfectly usable function in the library. :)
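To illustrate the trade-off, here is a small sketch of classifying a single character with plain `char` methods from the standard library. The commit message does not name the library function that was actually used, so `is_space_or_punct` is a hypothetical stand-in, and its notion of punctuation (ASCII only) is an assumption made for brevity.

```rust
/// Hypothetical helper: classify one character without any regex.
/// A regex-based version would first need `ch.to_string()` to get a `&str`
/// to match against, which allocates for every character checked.
fn is_space_or_punct(ch: char) -> bool {
    ch.is_whitespace() || ch.is_ascii_punctuation()
}
```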
Previously the quotes were implemented to be `const` generics, but there was no interface exposed to actually use this conveniently.
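As a rough sketch of what a convenient interface over `const` generic quote characters could look like: the type `QuoteRule`, its `wrap` function, and the two aliases are all hypothetical and not the crate's API. Only two of the quote characters are modelled to keep it short.

```rust
/// Hypothetical rule type, parameterised over its double-quote characters.
struct QuoteRule<const OPEN_DOUBLE: char, const CLOSE_DOUBLE: char>;

impl<const OPEN_DOUBLE: char, const CLOSE_DOUBLE: char> QuoteRule<OPEN_DOUBLE, CLOSE_DOUBLE> {
    /// Wrap `text` in this rule's quote characters.
    fn wrap(text: &str) -> String {
        format!("{}{}{}", OPEN_DOUBLE, text, CLOSE_DOUBLE)
    }
}

/// Aliases give callers a name to use instead of spelling out the parameters.
type EnglishQuotes = QuoteRule<'\u{201C}', '\u{201D}'>;
type GermanQuotes = QuoteRule<'\u{201E}', '\u{201C}'>;
```

With the aliases in place, a caller can write `EnglishQuotes::wrap("hi")` without touching the generic parameters at all.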
37b29b5 to 2fc9422
These are typically used in conjunction, so now is probably the right time to enable them by default.
a518605 to dc49f62
Okay, I went over most of the feedback now. I also went ahead and added the two modules (typographer and smartquotes) to the extras module by default, together with an example. I used a second call just to keep the lines in the example short.
I have to say I don't actually understand what the
It checks with the lowest possible versions of dependencies to make sure the lower semver bounds are correct. In short: forget about it, probably.
Is it good to merge as is, or did you want to add anything else? I'll merge if it is, and fix tests on my own.
For me it's good as is.
Ah damn, I have never needed the CLI so yeah I totally forgot about that.
This is the follow-up to #4 that I mentioned before, so it includes all of those commits, plus "a few" more.
I should first say thanks again for the reference implementation in JS; I wouldn't have been able to do it without it. 🙂
Of course this still needs linting and squashing, and I haven't run it through benchmarking at all. But at least I managed to avoid making string copies all over the place, so that's nice.
I had to add some lifetime annotations to the `walk` function to do that. Is that okay, or do you see a more elegant way to do this?
If you do want to crawl through the unsquashed history here, you'll find that the `IT WORKS` commit reads a lot like the JS version. However, I found that to be too long, so I started breaking things out, and the result is actually fairly readable, if I say so myself. 😛