Option to convert .md links to .html #1094

Closed
venthur opened this issue Jan 8, 2021 · 5 comments
Labels
3rd-party Should be implemented as a third party extension.

Comments

@venthur
Contributor

venthur commented Jan 8, 2021

Hi,

when converting a tree of markdown documents with internal links, it would be useful to convert the link targets from .md to .html as well.

E.g.:

[some link](link.md)

should point to link.html. I understand that this might not be the desired default behavior, but it would be a useful addition.

@facelessuser
Collaborator

You could certainly write an extension to handle this, but this is not standard Markdown behavior, so you are unlikely to see this integrated by default in Python Markdown.

waylan added the 3rd-party label (Should be implemented as a third party extension.) on Jan 8, 2021
waylan closed this as completed on Jan 8, 2021
@ricmua

ricmua commented May 27, 2022

Not sure if you resolved this, @venthur, and I didn't see any extensions. I have a similar need, and I think this does what we want:

import os
from markdown import markdown
from markdown import Extension
from markdown.inlinepatterns import LinkInlineProcessor
from markdown.inlinepatterns import LINK_RE

class CustomLinkInlineProcessor(LinkInlineProcessor):

    def getLink(self, *args, **kwargs):
        # Let the stock processor parse the link, then swap a trailing
        # '.md' extension for '.html'; any other extension is kept as-is.
        (href, title, index, handled) = super().getLink(*args, **kwargs)
        parts = os.path.splitext(href)
        ext = '.html' if (parts[1] == '.md') else parts[1]
        href = ''.join((parts[0], ext))
        return (href, title, index, handled)


class CustomLinkExtension(Extension):

    def extendMarkdown(self, md):
        # Replace the built-in 'link' pattern with the customized one,
        # keeping its default priority of 160.
        md.inlinePatterns.deregister('link')
        pattern = CustomLinkInlineProcessor(LINK_RE, md)
        md.inlinePatterns.register(pattern, 'link', 160)

data  = 'Some text and a [link](path/to/a/document.md).\n'
data += 'And a [normal link](http://www.google.com/image.jpg).\n'
data += 'More text.'
html = markdown(data, extensions=[CustomLinkExtension()])
print(f'ORIGINAL:\n{data}\nCONVERTED:\n{html}')
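
To see which hrefs the extension swap in getLink touches, the same os.path.splitext logic can be tried on its own with only the standard library. This is a rough sketch: the helper name rewrite_with_splitext and the last two sample hrefs are made up for illustration and are not part of the snippet above.

```python
import os


def rewrite_with_splitext(href):
    """Same extension swap as in getLink, isolated from Python-Markdown."""
    root, ext = os.path.splitext(href)
    return root + ('.html' if ext == '.md' else ext)


# Hypothetical sample hrefs; the first two come from the test data above.
samples = [
    'path/to/a/document.md',
    'http://www.google.com/image.jpg',
    'document.md#section',
    'https://example.com/readme.md',
]
for href in samples:
    print(f'{href} -> {rewrite_with_splitext(href)}')
```

With this logic, an href that carries a fragment (document.md#section) is left alone because its "extension" is '.md#section', while an absolute URL ending in .md is rewritten as well. Whether that matters depends on the documents being converted.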

I'm brand new to Python-Markdown as of yesterday, and I haven't packaged or tested this thoroughly yet.

EDIT: Created a repository for this.

@venthur
Contributor Author

venthur commented May 28, 2022

Interesting, I solved it slightly differently using a Treeprocessor:

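import logging
from urllib.parse import urlsplit, urlunsplit

from markdown.treeprocessors import Treeprocessor

logger = logging.getLogger(__name__)


class MarkdownLinkTreeprocessor(Treeprocessor):
    """Converts relative links to .md files to .html
    """

    def run(self, root):
        for element in root.iter():
            if element.tag == 'a':
                url = element.get('href')
                # Anchors without an href have nothing to convert.
                if url is None:
                    continue
                converted = self.convert(url)
                element.set('href', converted)
        return root

    def convert(self, url):
        scheme, netloc, path, query, fragment = urlsplit(url)
        logger.debug(
            f'{url}: {scheme=} {netloc=} {path=} {query=} {fragment=}'
        )
        # Leave absolute URLs and links without a path (e.g. '#anchor') alone.
        if (scheme or netloc or not path):
            return url
        if path.endswith('.md'):
            path = path[:-3] + '.html'

        url = urlunsplit((scheme, netloc, path, query, fragment))
        return url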
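
A Treeprocessor like the one above only takes effect once it is registered on the Markdown instance. Here is a minimal, self-contained sketch of that wiring, assuming Python-Markdown 3.x. The names MdLinkTreeprocessor, MdLinkExtension, the registry key 'mdlink', and the priority 15 are my own choices for illustration; the priority only needs to be below that of the built-in 'inline' treeprocessor (20) so that the a elements already exist when it runs.

```python
from urllib.parse import urlsplit, urlunsplit

from markdown import markdown
from markdown.extensions import Extension
from markdown.treeprocessors import Treeprocessor


class MdLinkTreeprocessor(Treeprocessor):
    """Compact variant of the treeprocessor above (hypothetical name)."""

    def run(self, root):
        for element in root.iter('a'):
            href = element.get('href')
            if href:
                element.set('href', self.convert(href))
        # Returning None tells Python-Markdown the tree was edited in place.

    @staticmethod
    def convert(url):
        scheme, netloc, path, query, fragment = urlsplit(url)
        # Only relative links whose path ends in .md are rewritten.
        if scheme or netloc or not path.endswith('.md'):
            return url
        path = path[:-3] + '.html'
        return urlunsplit((scheme, netloc, path, query, fragment))


class MdLinkExtension(Extension):
    """Registers the treeprocessor (hypothetical name and priority)."""

    def extendMarkdown(self, md):
        # Priority 15 is an assumption: lower than 'inline' (20), so the
        # link elements have already been built when this processor runs.
        md.treeprocessors.register(MdLinkTreeprocessor(md), 'mdlink', 15)


text = '[doc](guide.md#setup) and [site](https://example.com/a.md)'
html = markdown(text, extensions=[MdLinkExtension()])
print(html)
```

Because the URL is split before the check, the relative link keeps its fragment while the absolute one is left untouched.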

@ricmua

ricmua commented May 28, 2022

Excellent. Thank you.

@ricmua

ricmua commented May 30, 2022

Created a simple extension repository.
