Add optional support to underline elements #357

Open
darklinkpower opened this issue Oct 7, 2023 · 3 comments

Comments

@darklinkpower

Currently, `u` elements used for underlining are not supported, and the difficulty is that Markdown has no equivalent. As an idea, an option that is disabled by default could be added to convert them to italics.

@mysticmind
Owner

mysticmind commented Oct 7, 2023

Acknowledged. Let me see what can best be done for this case. What you have suggested is one possible approach.

@darklinkpower
Author

darklinkpower commented Oct 7, 2023

After opening the issue I thought of another approach that would be more future-proof, but I don't know if it's feasible to implement:

There could be an UnknownTagsReplacer property in the converter Config. It could be a simple key-value list that takes the element name as the key and the replacement as the value.

For example:

var htmlToMarkdownConverter = new Converter();

// Proposed API, not yet implemented: "u" is the HTML underline tag, "*" is the Markdown italic delimiter
var newReplacer = new Replacer("u", "*");
htmlToMarkdownConverter.Config.UnknownTagsReplacer.Add(newReplacer);

// or
htmlToMarkdownConverter.Config.UnknownTagsReplacer.Add("u", "*");

During conversion, it would convert this element:

<u>Some underline text</u>

To:

*Some underline text*


I made a rudimentary solution like this for my use case, but I don't know how it should be approached if done properly:

var unsupportedElemsRegexFormatters = new Dictionary<string, string>
{
    {@"<strike class=""bb_strike"">((.|\n)*?)</strike>", "~~$1~~" },
    {@"<u>((.|\n)*?)</u>", "*$1*" }
};

// Some elements not supported by the converter need to be manually converted
foreach (var item in unsupportedElemsRegexFormatters)
{
    text = Regex.Replace(text, item.Key, item.Value);
}
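
For anyone who wants to reuse the workaround above, here is a minimal, self-contained sketch of the same idea. The class name `UnsupportedTagPreprocessor` and its tag-to-delimiter table are hypothetical and not part of ReverseMarkdown. Unlike the snippet above, it matches the tag with any attributes rather than only `class="bb_strike"`, and it uses `RegexOptions.Singleline` instead of `(.|\n)`. It does not handle nested tags of the same name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class UnsupportedTagPreprocessor
{
    // Hypothetical mapping: tag name -> Markdown delimiter wrapped around its content.
    private static readonly Dictionary<string, string> Delimiters = new Dictionary<string, string>
    {
        { "u", "*" },        // underline has no Markdown equivalent, so fall back to italics
        { "strike", "~~" }   // strikethrough
    };

    // Run this on the HTML before handing it to the converter.
    public static string Replace(string html)
    {
        foreach (var pair in Delimiters)
        {
            var tag = Regex.Escape(pair.Key);
            var delimiter = pair.Value;

            // \b stops "u" from matching "<ul>"; [^>]* allows attributes.
            var pattern = $@"<{tag}\b[^>]*>(.*?)</{tag}>";

            // A MatchEvaluator avoids "$" in a delimiter being read as a substitution.
            html = Regex.Replace(
                html,
                pattern,
                m => delimiter + m.Groups[1].Value + delimiter,
                RegexOptions.Singleline | RegexOptions.IgnoreCase);
        }
        return html;
    }
}
```

Calling `text = UnsupportedTagPreprocessor.Replace(text);` before the conversion takes the place of the `foreach` loop above.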

@mysticmind
Owner

You could use a regex as you have done, or load the HTML via HtmlAgilityPack and replace the elements as you need. I think the idea you have outlined is a good one, and I will implement it in the next release, in a month or so.
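
The HtmlAgilityPack route mentioned above might look roughly like the sketch below. It assumes the HtmlAgilityPack NuGet package is referenced; the class and method names (`UnderlineRewriter`, `UnderlineToEmphasis`) are made up for illustration. It swaps each `u` element for an `em` element before conversion, on the assumption that the converter already renders `em` as italics.

```csharp
using HtmlAgilityPack;

public static class UnderlineRewriter
{
    // Hypothetical helper: rewrites <u>...</u> as <em>...</em> in an HTML fragment.
    public static string UnderlineToEmphasis(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//u");
        if (nodes == null)
        {
            return html;
        }

        // Walk backwards so nested <u> elements are replaced before their ancestors.
        for (int i = nodes.Count - 1; i >= 0; i--)
        {
            var node = nodes[i];
            var replacement = HtmlNode.CreateNode("<em>" + node.InnerHtml + "</em>");
            node.ParentNode.ReplaceChild(replacement, node);
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```

The returned string can then be passed to the converter in place of the original HTML. Parsing the document avoids the edge cases a regex has with attributes and nesting, at the cost of an extra parse.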
