
Add support for parsing hyperlinks from docx and rendering hyperlinks in HTML & Markdown #4

Closed
olivM wants to merge 2 commits into paperdoc-dev:main from olivM:feature/parse-docx-hyperlink

Conversation

olivM (Contributor) commented Apr 23, 2026

Add support for parsing hyperlinks from docx and rendering hyperlinks in HTML & Markdown

#3

What
Add support for parsing hyperlinks from docx files and rendering them in HTML and Markdown.

Why
Hyperlinks in docx files were ignored by the parser.

How
Added a new TextLink $link property to the TextRun class.

Tests
DocxParserTest::test_parse_docx_with_hyperlink
HtmlRendererTest::test_renders_hyperlink
MarkdownRenderer::class

Breaking changes
No breaking changes; all existing tests remain valid.

olivM and others added 2 commits April 23, 2026 19:30
and rendering hyperlink in html & markdown
Co-authored-by: Copilot <copilot@github.com>

AkramZerarka (Contributor) commented Apr 24, 2026

Hi @olivM — thanks a ton for this, and sorry for the slow first round. 🙏

Hyperlink support wasn't on the short-term roadmap, but your PR moved it right to the top. The DOCX parser is admittedly a bit dense; the part you were looking for lives in DocxParser::parseRuns(), which walks each <w:p> child and dispatches on localName. Your approach of adding a <w:hyperlink> branch that re-enters parseRuns() with a TextLink context is exactly how we'd have done it — clean and minimal.
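For readers who have not seen the parser, the dispatch-and-re-enter pattern described above can be sketched roughly as follows. This is a minimal illustration, not paperdoc-lib's code: the function name `parseRunsSketch`, the array shape of a run, and the `$rels` relationship map are all assumptions made for the example. Only the two namespace URIs are standard OOXML values. It needs PHP's DOM extension.

```php
<?php
// Hypothetical sketch: none of these names are taken from paperdoc-lib's source.

const W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
const R_NS = 'http://schemas.openxmlformats.org/officeDocument/2006/relationships';

/**
 * Walk the element children of $parent and return a flat list of runs.
 * $rels maps relationship ids to target URLs (assumed shape);
 * $link is the link context inherited from an enclosing <w:hyperlink>.
 */
function parseRunsSketch(DOMElement $parent, array $rels, ?string $link = null): array
{
    $runs = [];
    foreach ($parent->childNodes as $child) {
        // Skip text nodes, comments, and elements outside the "w" namespace.
        if (!$child instanceof DOMElement || $child->namespaceURI !== W_NS) {
            continue;
        }
        switch ($child->localName) {
            case 'r':
                // A plain run: gather its <w:t> text and tag it with the current link.
                $text = '';
                foreach ($child->getElementsByTagNameNS(W_NS, 't') as $t) {
                    $text .= $t->textContent;
                }
                $runs[] = ['text' => $text, 'link' => $link];
                break;
            case 'hyperlink':
                // Resolve r:id via the relationship map, then re-enter with that context.
                // A missing or unknown id yields null, so the inner runs are still emitted.
                $rid = $child->getAttributeNS(R_NS, 'id');
                $runs = array_merge($runs, parseRunsSketch($child, $rels, $rels[$rid] ?? null));
                break;
        }
    }
    return $runs;
}

// Usage: one plain run followed by one linked run.
$xml = '<w:p xmlns:w="' . W_NS . '" xmlns:r="' . R_NS . '">'
     . '<w:r><w:t>See </w:t></w:r>'
     . '<w:hyperlink r:id="rId1"><w:r><w:t>the docs</w:t></w:r></w:hyperlink>'
     . '</w:p>';
$doc = new DOMDocument();
$doc->loadXML($xml);
$runs = parseRunsSketch($doc->documentElement, ['rId1' => 'https://example.com']);
```

In this sketch the link context is just a URL string; the PR itself carries a TextLink object on each TextRun instead.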

What's happening now

We've opened a superseding PR (#5) that keeps both your commits as-is (you stay the author) and adds a small review pass on top:

  • DocxParser — when r:id is missing or unresolved (internal bookmarks, broken relationships), the inner runs used to be silently dropped. They're now always emitted, with w:anchor and w:tooltip recognised.
  • HtmlRenderer — when a run had both a style and a link, the style was being discarded on the <a>. Fixed. Also added target="_blank" rel="noopener noreferrer" for external schemes (tabnabbing protection) and a title attribute.
  • MarkdownRenderer — labels now escape [/] and URLs with whitespace/parens are wrapped with <…>, so the output stays valid markdown even with edge-case content.
  • TextLink — enriched with optional anchor and title, plus getHref() and isExternal(). Your TextLink::make()->setUrl(...) calls are fully backwards-compatible.
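The renderer rules in the list above can be illustrated with a short sketch. The helper names (`isExternalSketch`, `markdownLinkSketch`, `htmlLinkSketch`) are hypothetical, and the "external" test used here (an http or https scheme) is an assumption; the actual `isExternal()` in #5 may apply a different rule.

```php
<?php
// Hypothetical helpers illustrating the renderer rules; not paperdoc-lib's actual code.

/** Assumption: only http(s) URLs count as external. */
function isExternalSketch(string $href): bool
{
    return preg_match('#^https?://#i', $href) === 1;
}

function markdownLinkSketch(string $label, string $url): string
{
    // Backslash-escape brackets so the label cannot close the link text early.
    $label = strtr($label, ['[' => '\\[', ']' => '\\]']);
    // Whitespace or parentheses would break a bare destination, so wrap it in <...>.
    if (preg_match('/[\s()]/', $url) === 1) {
        $url = '<' . $url . '>';
    }
    return '[' . $label . '](' . $url . ')';
}

function htmlLinkSketch(string $innerHtml, string $href, ?string $title = null): string
{
    $attrs = ' href="' . htmlspecialchars($href, ENT_QUOTES) . '"';
    if ($title !== null && $title !== '') {
        $attrs .= ' title="' . htmlspecialchars($title, ENT_QUOTES) . '"';
    }
    if (isExternalSketch($href)) {
        // Opening in a new tab without rel="noopener" exposes window.opener (tabnabbing).
        $attrs .= ' target="_blank" rel="noopener noreferrer"';
    }
    // $innerHtml is the already-styled run, so its styling survives inside the <a>.
    return '<a' . $attrs . '>' . $innerHtml . '</a>';
}

echo markdownLinkSketch('See [1]', 'https://example.com/a (b)'), "\n";
// prints: [See \[1\]](<https://example.com/a (b)>)
echo htmlLinkSketch('<strong>docs</strong>', 'https://example.com', 'Docs'), "\n";
// prints: <a href="https://example.com" title="Docs" target="_blank" rel="noopener noreferrer"><strong>docs</strong></a>
```

Passing the styled markup as `$innerHtml` is one simple way to keep a run's style and link together. How #5 actually combines the two is not shown in this thread.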

Release

Shipping as v0.3.8. You're now the first external contributor listed in CONTRIBUTORS.md and in the README, and the 0.3.8 CHANGELOG entry has a dedicated Credits section with your name and a link to this PR.

Closing this one in favour of #5. Thanks again: this is genuinely useful work, and it makes paperdoc-lib better for everyone doing .docx → .md pipelines.

Looking forward to more 👋

AkramZerarka added a commit that referenced this pull request Apr 24, 2026
Supersedes #4. First external contribution by @olivM.

Adds DOCX hyperlink parsing, HTML rendering with style preservation + safe rel/target, Markdown rendering with proper escaping, and a TextLink value object with anchor/title support.

See CHANGELOG.md v0.3.8 for details.