Cannot extract relative reference links in Markdown

Test case:

```markdown
Inline [link1](target1.md)

Reference [link2][link2]

[link2]: target2.md

Collapsed [link3][]

[link3]: target3.md

Shortcut [link4]

[link4]: target4.md

Shortcut [link5] with full URL

[link5]: file:///path/to/target5.md
```

Save this as `~/junk/lychee/baz.md` and process it with `lychee baz.md --dump -vv`, and it prints:

```
file:///home/wks/junk/lychee/target1.md (baz.md)
file:///path/to/target5.md (baz.md)
```

It successfully extracts the link to `target1.md` and resolves the relative path to a URL starting with `file:///...`.

But link2 through link4 are not extracted.  link5 points to a full URL rather than a filename, and it is extracted as well.

I think the problem is in how the Markdown extractor handles the different link types.

```rust
// excerpt from lychee-lib/src/extract/markdown.rs

pub(crate) fn extract_markdown(input: &str, include_verbatim: bool) -> Vec<RawUri> {
// ...
                match link_type {
                    LinkType::Inline => {
                        Some(vec![RawUri {
                            text: dest_url.to_string(),
                            element: Some("a".to_string()),
                            attribute: Some("href".to_string()),
                        }])
                    }
                    LinkType::Reference |
                    LinkType::ReferenceUnknown |
                    LinkType::Collapsed |
                    LinkType::CollapsedUnknown |
                    LinkType::Shortcut |
                    LinkType::ShortcutUnknown |
                    LinkType::Autolink |
                    LinkType::Email =>
                        Some(extract_raw_uri_from_plaintext(&dest_url)),
// ...
```

For inline links, it simply uses `dest_url` as the href.  For every other kind of link, it calls `extract_raw_uri_from_plaintext`, which uses heuristics to detect URLs.  So a destination that doesn't look like a URL, such as `foo_bar_baz.md` in `[label]: foo_bar_baz.md`, is ignored.
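
One possible direction is to handle reference-style links the same way as inline links, and keep the plaintext heuristic only for autolinks and emails. The sketch below illustrates the idea with simplified stand-in types. `RawUri` and `LinkType` here are local toy definitions, not the real ones from lychee or the Markdown parser, and `looks_like_url` and `extract_link` are hypothetical helpers, not existing lychee functions. It also assumes the parser has already resolved `dest_url` from the link reference definition, which I have not verified for the `*Unknown` variants, so they are left out.

```rust
// Toy stand-ins for the types in the excerpt above (NOT lychee's real definitions).
#[derive(Debug, Clone, PartialEq)]
struct RawUri {
    text: String,
    element: Option<String>,
    attribute: Option<String>,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum LinkType {
    Inline,
    Reference,
    Collapsed,
    Shortcut,
    Autolink,
    Email,
}

/// Hypothetical, much simplified replacement for the plaintext heuristic:
/// accepts only strings that carry an explicit scheme.
fn looks_like_url(s: &str) -> bool {
    s.contains("://") || s.starts_with("mailto:")
}

/// Build the same kind of `RawUri` the inline branch produces.
fn href(dest_url: &str) -> RawUri {
    RawUri {
        text: dest_url.to_string(),
        element: Some("a".to_string()),
        attribute: Some("href".to_string()),
    }
}

/// Sketch of the proposed behavior: take the destination verbatim for
/// inline and reference-style links, and apply the heuristic only to
/// autolinks and emails.
fn extract_link(link_type: LinkType, dest_url: &str) -> Vec<RawUri> {
    match link_type {
        LinkType::Inline
        | LinkType::Reference
        | LinkType::Collapsed
        | LinkType::Shortcut => vec![href(dest_url)],
        LinkType::Autolink | LinkType::Email => {
            if looks_like_url(dest_url) {
                vec![href(dest_url)]
            } else {
                Vec::new()
            }
        }
    }
}
```

With this shape, `extract_link(LinkType::Shortcut, "target4.md")` yields one `RawUri` whose `text` is `target4.md`, so relative destinations like those of link2 to link4 would no longer be dropped before the later path resolution step.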



Cannot extract relative reference links in Markdown #1657
