Consider using a proper Markdown parser to remove non-textual content #125

Open
Eilon opened this issue Jul 8, 2022 · 0 comments
Labels: enhancement (New feature or request), future

Eilon commented Jul 8, 2022

Related: #62

If we use a proper Markdown parser, it would be great to take some Markdown like this:

> Hey brothers, check out this radical *code*:
> ```javascript
> var x = document.foo(...);
> var bananas = monkey.pole();
> ```
> And also what if we frobbed the `bananas` variable and instead used a `fruit` syntax model?
> # Alternative solution
> What if instead of fruit, we supported the `MSFT.Botany` package?

And trim it down to something like this:

> Hey brothers, check out this radical code:
> And also what if we frobbed the  variable and instead used a  syntax model?
> Alternative solution
> What if instead of fruit, we supported the  package?

Note that all code between backticks (both fenced blocks and inline spans), all formatting marks (asterisks, underscores, etc.), and heading marks (the leading hashes) are removed.

Then run the textual analysis on only the remaining sentences.
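
To pin down the expected output, here is a rough sketch in Python. It is a regex-based approximation, not a proper Markdown parser, and the name `strip_markdown` and the patterns are made up for illustration; nothing here reflects the project's actual code. It only handles the three cases from the example above: fenced code blocks, inline code spans, and emphasis/heading marks.

```python
import re

# Hypothetical patterns for illustration only; a real parser would replace these.
FENCE = re.compile(r"^\s*(```|~~~)")          # opening or closing code fence
HEADING = re.compile(r"^\s{0,3}#{1,6}\s+")    # leading hashes of an ATX heading
INLINE_CODE = re.compile(r"`[^`\n]*`")        # single-backtick inline code span
EMPHASIS = re.compile(r"(\*\*|__|\*|_)(?=\S)(.+?)(?<=\S)\1")  # *x*, **x**, _x_, __x__


def strip_markdown(text: str) -> str:
    """Drop code and formatting marks, keeping only the plain sentences."""
    kept = []
    in_fence = False
    for line in text.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence   # toggle on every fence line, and drop it
            continue
        if in_fence:
            continue                  # drop everything inside a fenced block
        line = HEADING.sub("", line)          # remove the hashes, keep the title
        line = INLINE_CODE.sub("", line)      # remove inline code entirely
        line = EMPHASIS.sub(r"\2", line)      # remove the marks, keep the words
        kept.append(line)
    return "\n".join(kept)


fence = "`" * 3  # built at runtime so this sample does not contain a literal fence
SAMPLE = "\n".join([
    "Hey brothers, check out this radical *code*:",
    fence + "javascript",
    "var x = document.foo(...);",
    "var bananas = monkey.pole();",
    fence,
    "And also what if we frobbed the `bananas` variable and instead used a `fruit` syntax model?",
    "# Alternative solution",
    "What if instead of fruit, we supported the `MSFT.Botany` package?",
])

print(strip_markdown(SAMPLE))
```

This reproduces the trimmed text above, including the double spaces left where inline code was removed. It will get other things wrong (for example, it treats `snake_case_name` as emphasis and ignores links, lists, and nested blockquotes), which is part of the argument for using a real parser instead.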

jonathanpeppers added the enhancement label Aug 19, 2022