Automatic links? #494

Closed
vpzomtrrfrt opened this issue Oct 10, 2020 · 5 comments

Comments

@vpzomtrrfrt

Is it possible to use this library and have plain URLs automatically become links like they do in GFM?

@marcusklaas
Collaborator

marcusklaas commented Nov 8, 2020

pulldown-cmark does not support this directly, but since it is essentially an exercise in text search, it's relatively straightforward to build on top of the library. An example of how this could be done follows below.

// Uses the tuple-style `Tag::Link(..)` variant; newer pulldown-cmark releases
// changed this part of the API.

use std::io::Write as _;

use pulldown_cmark::{html, CowStr, Event, LinkType, Parser, Tag};
use regex::Regex;

// `www\.` escapes the dot so that only a literal "www." prefix matches.
static URL_REGEX: &str = r#"((https?|ftp)://|www\.)[^\s/$.?#].[^\s]*[^.^\s]"#;

fn main() {
    let markdown_input: &str =
        "Hi! This is an URL: http://github.com. Another one is www.google.com.";
    println!("Parsing the following markdown string:\n{}", markdown_input);

    let autolinker = AutoLinker {
        iter: Parser::new(markdown_input),
        state: AutoLinkerState::Clear,
        regex: Regex::new(URL_REGEX).unwrap(),
    };

    // Write to anything implementing the `Write` trait. This could also be a file
    // or network socket.
    let stdout = std::io::stdout();
    let mut handle = stdout.lock();
    handle.write_all(b"\nHTML output:\n").unwrap();
    html::write_html(&mut handle, autolinker).unwrap();
}

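/// Which of the three link events should be emitted next.
enum LinkState {
    Open,
    Label,
    Close,
}

/// What the adapter is in the middle of doing between calls to `next`.
enum AutoLinkerState<'a> {
    /// Nothing buffered; pull the next event from the inner iterator.
    Clear,
    /// Emitting a link: (next step, link text, text remaining after the link).
    Link(LinkState, CowStr<'a>, CowStr<'a>),
    /// Text after a link that still has to be scanned for further URLs.
    TrailingText(CowStr<'a>),
}

/// Iterator adapter that turns plain URLs in text events into links.
struct AutoLinker<'a, I> {
    iter: I,
    state: AutoLinkerState<'a>,
    regex: Regex,
}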

impl<'a, I> Iterator for AutoLinker<'a, I>
where
    I: Iterator<Item = Event<'a>>,
{
    type Item = Event<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        let text = match std::mem::replace(&mut self.state, AutoLinkerState::Clear) {
            AutoLinkerState::Clear => match self.iter.next() {
                Some(Event::Text(text)) => text,
                x => return x,
            },
            AutoLinkerState::TrailingText(text) => text,
            AutoLinkerState::Link(link_state, link_text, trailing_text) => match link_state {
                LinkState::Open => {
                    // Emit the opening tag; the label is emitted on the next call.
                    self.state = AutoLinkerState::Link(
                        LinkState::Label,
                        link_text.clone(),
                        trailing_text,
                    );
                    return Some(Event::Start(Tag::Link(
                        LinkType::Inline,
                        link_text,
                        "".into(),
                    )));
                }
                LinkState::Label => {
                    // Emit the URL itself as the visible link label.
                    self.state = AutoLinkerState::Link(
                        LinkState::Close,
                        link_text.clone(),
                        trailing_text,
                    );
                    return Some(Event::Text(link_text));
                }
                LinkState::Close => {
                    // Emit the closing tag; the remaining text is scanned next.
                    self.state = AutoLinkerState::TrailingText(trailing_text);
                    return Some(Event::End(Tag::Link(
                        LinkType::Inline,
                        link_text,
                        "".into(),
                    )));
                }
            },
        };

        match self.regex.find(&text) {
            Some(reg_match) => {
                let link_text = reg_match.as_str();
                let leading_text = &text.as_ref()[..reg_match.start()];
                let trailing_text = &text.as_ref()[reg_match.end()..];

                self.state = AutoLinkerState::Link(
                    LinkState::Open,
                    link_text.to_owned().into(),
                    trailing_text.to_owned().into(),
                );

                Some(Event::Text(leading_text.to_owned().into()))
            }
            None => Some(Event::Text(text)),
        }
    }
}

This produces the following output:

Parsing the following markdown string:
Hi! This is an URL: http://github.com. Another one is www.google.com.

HTML output:
<p>Hi! This is an URL: <a href="http://github.com">http://github.com</a>. Another one is <a href="www.google.com">www.google.com</a>.</p>
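
One thing to note about the output above: `href="www.google.com"` has no scheme, so a browser will resolve it relative to the current page. GFM's autolink extension inserts `http://` for bare `www.` links. A minimal sketch of doing the same, using a hypothetical helper `href_for` (my own name, not part of pulldown-cmark):

```rust
/// Returns the `href` to use for a matched autolink.
///
/// Assumption: a match starting with "www." has no scheme and should get an
/// `http://` prefix, mirroring what GFM does. Everything else is kept as-is.
fn href_for(link_text: &str) -> String {
    if link_text.starts_with("www.") {
        format!("http://{}", link_text)
    } else {
        link_text.to_owned()
    }
}
```

In the iterator above, the result would go in the destination slot of both `Tag::Link` events (for example `href_for(&link_text).into()`), while `link_text` itself stays as the visible label.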

I hope that this answers your question. If it does not, please feel free to reopen this issue.
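
A caveat for anyone copying this: the adapter rewrites every `Event::Text`, including text inside code blocks and inside the label of an existing link, so it could emit a link within a link. One way to guard against that is to count how deep the stream is inside such spans. The sketch below uses stand-in types so that it compiles on its own (`Ev`, `Kind` and `SkipTracker` are hypothetical names, not pulldown-cmark ones); with the real library the same match would be on `Event::Start(Tag::Link(..))`, `Tag::CodeBlock(..)` and their `End` counterparts.

```rust
/// Stand-in for the kinds of container tags we care about (hypothetical).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Kind {
    Paragraph,
    Link,
    CodeBlock,
}

/// Stand-in for pulldown-cmark's `Event` (hypothetical, illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Ev {
    Start(Kind),
    End(Kind),
    Text(String),
}

/// Counts how many link / code-block spans are currently open.
#[derive(Default)]
struct SkipTracker {
    depth: usize,
}

impl SkipTracker {
    /// Feed every event from the inner iterator through this method.
    /// Returns true only for text events that are safe to scan for URLs.
    fn may_autolink(&mut self, ev: &Ev) -> bool {
        match ev {
            Ev::Start(Kind::Link) | Ev::Start(Kind::CodeBlock) => {
                self.depth += 1;
                false
            }
            Ev::End(Kind::Link) | Ev::End(Kind::CodeBlock) => {
                self.depth = self.depth.saturating_sub(1);
                false
            }
            Ev::Text(_) => self.depth == 0,
            _ => false,
        }
    }
}
```

The tracker should only see events coming from the wrapped parser, not the link events the adapter generates itself; text for which `may_autolink` returns false would simply be passed through unchanged.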

@vpzomtrrfrt
Author

Seems to work, thanks!

I'm going to put this into a crate under CC0 unless you have a problem with that

@marcusklaas
Collaborator

That's cool with me. I release the code above to the public domain.

@WesleyAC

WesleyAC commented Apr 7, 2021

I know this issue is closed, but it would be really nice if this was an option that pulldown_cmark provided. Since it isn't, I'll copy the code from this issue (thanks for providing it :) ), but it seems like this is a sort of common thing to want, and it'd be nice if it came built-in, with sufficient warning next to the option that it may be slow/etc.

@osmarks

osmarks commented Dec 4, 2023

The code suggested here breaks for some links with underscores in them.
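
The comment does not say why. One plausible cause (an assumption, not verified in this thread) is that the parser can split a run of text around `_`, a potential emphasis delimiter, into several consecutive text events, so the regex only ever sees a fragment of the URL. If so, merging adjacent text events before autolinking should help. Below is a minimal sketch with a stand-in event type (`Token` and `MergeText` are hypothetical names); recent pulldown-cmark releases also appear to ship an adapter for this purpose (`TextMergeStream`, if I recall the name correctly).

```rust
use std::iter::Peekable;

/// Stand-in for pulldown-cmark's `Event` (hypothetical, illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Token {
    Text(String),
    Other(&'static str),
}

/// Iterator adapter that concatenates runs of consecutive `Token::Text`.
struct MergeText<I: Iterator<Item = Token>> {
    inner: Peekable<I>,
}

impl<I: Iterator<Item = Token>> MergeText<I> {
    fn new(iter: I) -> Self {
        MergeText { inner: iter.peekable() }
    }
}

impl<I: Iterator<Item = Token>> Iterator for MergeText<I> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        match self.inner.next()? {
            Token::Text(mut buf) => {
                // Absorb every directly following text token into `buf`.
                while let Some(Token::Text(_)) = self.inner.peek() {
                    if let Some(Token::Text(more)) = self.inner.next() {
                        buf.push_str(&more);
                    }
                }
                Some(Token::Text(buf))
            }
            // Anything that is not text passes through untouched.
            other => Some(other),
        }
    }
}
```

With the real library, the autolinking adapter would wrap such a merging adapter instead of wrapping `Parser::new(..)` directly, so that each URL reaches the regex in one piece.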
