Mermaid and Kroki support #41

Merged
merged 5 commits into from
Sep 1, 2024
8 changes: 7 additions & 1 deletion README.md
@@ -20,6 +20,7 @@ This Python package
* Image references (uploaded as Confluence page attachments)
* [Table of Contents](https://docs.gitlab.com/ee/user/markdown.html#table-of-contents)
* [Admonitions](https://python-markdown.github.io/extensions/admonition/) (converted into *info*, *tip*, *note* and *warning* Confluence panels)
* [Mermaid diagrams](sample/with_mermaid.md)
* [Markdown in HTML](https://python-markdown.github.io/extensions/md_in_html/)

## Getting started
@@ -41,14 +42,15 @@ In order to get started, you will need

### Setting up the environment

Confluence organization domain, base path, username, API token and space key can be specified at runtime or set as Confluence environment variables (e.g. add to your `~/.profile` on Linux, or `~/.bash_profile` or `~/.zshenv` on MacOS):
Confluence organization domain, base path, username, API token, space key, and Kroki server URL can be specified at runtime or set as Confluence environment variables (e.g. add to your `~/.profile` on Linux, or `~/.bash_profile` or `~/.zshenv` on MacOS):

```bash
export CONFLUENCE_DOMAIN='instructure.atlassian.net'
export CONFLUENCE_PATH='/wiki/'
export CONFLUENCE_USER_NAME='levente.hunyadi@instructure.com'
export CONFLUENCE_API_KEY='0123456789abcdef'
export CONFLUENCE_SPACE_KEY='DAP'
export KROKI_SERVER_URL=https://kroki.your.env
Owner commented:

I would cover Mermaid and Kroki in a separate section (e.g. Embedding Mermaid diagrams). This keeps the setup instructions simple for basic use, and advanced users can explore additional features as they read on.

```

On Windows, these can be set via system properties.
@@ -133,6 +135,10 @@ optional arguments:
--generated-by GENERATED_BY
Add prompt to pages (default: 'This page has been generated with a tool.').
--no-generated-by Do not add 'generated by a tool' prompt to pages.
--render-mermaid Render Mermaid diagrams as image files and add as attachments.
--no-render-mermaid Inline mermaid diagram in the confluence page.
--render-mermaid-format {png,svg}
Format for rendering mermaid diagrams (default: 'png').
--ignore-invalid-url Emit a warning but otherwise ignore relative URLs that point to ill-specified locations.
--local Write XHTML-based Confluence Storage Format files locally without invoking Confluence API.
```
22 changes: 22 additions & 0 deletions md2conf/__main__.py
@@ -80,6 +80,26 @@ def main() -> None:
const=None,
help="Do not add 'generated by a tool' prompt to pages.",
)
parser.add_argument(
"--render-mermaid",
dest="render_mermaid",
action="store_true",
default=True,
help="Render Mermaid diagrams as image files and add as attachments.",
)
parser.add_argument(
"--no-render-mermaid",
dest="render_mermaid",
action="store_false",
help="Inline mermaid diagram in the confluence page.",
)
parser.add_argument(
"--render-mermaid-format",
dest="kroki_output_format",
choices=["png", "svg"],
default="png",
help="Format for rendering mermaid diagrams (default: 'png').",
Owner commented:

Did you get Confluence to work with SVG? I have had trouble embedding SVG in wiki pages, which is why md2conf currently references a PNG image instead of an SVG image when both formats are found.

)
parser.add_argument(
"--ignore-invalid-url",
action="store_true",
@@ -109,6 +129,8 @@ def main() -> None:
ignore_invalid_url=args.ignore_invalid_url,
generated_by=args.generated_by,
root_page_id=args.root_page,
render_mermaid=args.render_mermaid,
kroki_output_format=args.kroki_output_format,
)
properties = ConfluenceProperties(
args.domain, args.path, args.username, args.apikey, args.space
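
For reference, the paired `--render-mermaid` / `--no-render-mermaid` flags in this hunk write to the same `dest`, so whichever flag appears last on the command line wins. Below is a minimal standalone sketch of that pattern using only the standard library. The `build_parser` name and the `sketch` program name are hypothetical and not part of md2conf.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper mirroring the three options added in this hunk."""
    parser = argparse.ArgumentParser(prog="sketch")
    # Both flags share dest="render_mermaid"; the first sets it, the second clears it.
    parser.add_argument(
        "--render-mermaid",
        dest="render_mermaid",
        action="store_true",
        default=True,
    )
    parser.add_argument(
        "--no-render-mermaid",
        dest="render_mermaid",
        action="store_false",
    )
    # The user-facing flag name differs from the internal attribute name.
    parser.add_argument(
        "--render-mermaid-format",
        dest="kroki_output_format",
        choices=["png", "svg"],
        default="png",
    )
    return parser


args = build_parser().parse_args(["--no-render-mermaid", "--render-mermaid-format", "svg"])
print(args.render_mermaid, args.kroki_output_format)  # prints: False svg
```

With no flags given, `render_mermaid` resolves to `True` here, whereas the option dataclasses in `converter.py` default it to `False`. The CLI default is therefore the one that takes effect for command-line use.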
41 changes: 33 additions & 8 deletions md2conf/api.py
@@ -1,3 +1,4 @@
import io
import json
import logging
import mimetypes
@@ -61,7 +62,6 @@ def removeprefix(string: str, prefix: str) -> str:
else:
return string


LOGGER = logging.getLogger(__name__)


@@ -186,24 +186,30 @@ def upload_attachment(
page_id: str,
attachment_path: Path,
attachment_name: str,
raw_data: Optional[bytes] = None,
comment: Optional[str] = None,
*,
space_key: Optional[str] = None,
force: bool = False,
) -> None:
content_type = mimetypes.guess_type(attachment_path, strict=True)[0]

if not attachment_path.is_file():
if not raw_data and not attachment_path.is_file():
Owner commented:

raw_data and attachment_path seem to be mutually exclusive. I would expect the function to raise an exception when both are present.
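
A minimal sketch of the check this comment asks for, under two assumptions that are not in the diff: `attachment_path` becomes optional (in the PR it is a required positional argument), and a plain `ValueError` stands in for `ConfluenceError` so the example stays self-contained. The helper name `check_attachment_source` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def check_attachment_source(
    attachment_path: Optional[Path], raw_data: Optional[bytes]
) -> None:
    """Hypothetical validation helper: exactly one data source must be given."""
    # Reject both "neither given" and "both given" with a single comparison.
    if (attachment_path is None) == (raw_data is None):
        raise ValueError("specify exactly one of attachment_path or raw_data")
    # Only a path-based upload needs the file to exist on disk.
    if attachment_path is not None and not attachment_path.is_file():
        raise FileNotFoundError(f"file not found: {attachment_path}")
```

Note that this sketch tests `raw_data is None`, so an empty `bytes` object counts as a supplied source, whereas the `not raw_data` test in the diff treats empty data as absent.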

raise ConfluenceError(f"file not found: {attachment_path}")

try:
attachment = self.get_attachment_by_name(
page_id, attachment_name, space_key=space_key
)

if not force and attachment.file_size == attachment_path.stat().st_size:
LOGGER.info("Up-to-date attachment: %s", attachment_name)
return
if not raw_data:
if not force and attachment.file_size == attachment_path.stat().st_size:
LOGGER.info("Up-to-date attachment: %s", attachment_name)
return
else:
if not force and attachment.file_size == len(raw_data):
LOGGER.info("Up-to-date embedded image: %s", attachment_name)
return

id = removeprefix(attachment.id, "att")
path = f"/content/{page_id}/child/attachment/{id}/data"
@@ -213,17 +219,36 @@ def upload_attachment(

url = self._build_url(path)

with open(attachment_path, "rb") as attachment_file:
if not raw_data:
with open(attachment_path, "rb") as attachment_file:
file_to_upload = {
"comment": comment,
"file": (
attachment_name, # will truncate path component
attachment_file,
content_type,
{"Expires": "0"},
),
}
LOGGER.info("Uploading attachment: %s", attachment_name)
response = self.session.post(
url,
files=file_to_upload, # type: ignore
headers={"X-Atlassian-Token": "no-check"},
)
Owner commented on lines +224 to +238:

Much of this looks like a repetition of the block below. Perhaps we could encapsulate it in a function?
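
One possible shape for that refactoring, offered as a sketch rather than the change that was eventually merged: a context manager hides whether the bytes come from disk or from memory, and a second helper builds the multipart payload in the same shape both branches use. The names `open_attachment_stream` and `build_upload_files` are hypothetical.

```python
import io
from contextlib import contextmanager
from pathlib import Path
from typing import Any, BinaryIO, Dict, Iterator, Optional


@contextmanager
def open_attachment_stream(
    attachment_path: Path, raw_data: Optional[bytes] = None
) -> Iterator[BinaryIO]:
    """Yield a binary stream over raw_data if given, otherwise over the file."""
    if raw_data is not None:
        yield io.BytesIO(raw_data)
    else:
        # The file is closed automatically when the caller's with-block exits.
        with open(attachment_path, "rb") as attachment_file:
            yield attachment_file


def build_upload_files(
    attachment_name: str,
    stream: BinaryIO,
    content_type: Optional[str],
    comment: Optional[str],
) -> Dict[str, Any]:
    """Build the multipart payload with the same fields as both diff branches."""
    return {
        "comment": comment,
        "file": (attachment_name, stream, content_type, {"Expires": "0"}),
    }
```

Inside `upload_attachment`, the two branches could then collapse into a single `with open_attachment_stream(...) as stream:` block followed by the existing `self.session.post(...)` call.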

else:
LOGGER.info("Uploading raw data: %s", attachment_name)

file_to_upload = {
"comment": comment,
"file": (
attachment_name, # will truncate path component
attachment_file,
io.BytesIO(raw_data), # type: ignore
content_type,
{"Expires": "0"},
),
}
LOGGER.info("Uploading attachment: %s", attachment_name)

response = self.session.post(
url,
files=file_to_upload, # type: ignore
10 changes: 9 additions & 1 deletion md2conf/application.py
@@ -131,9 +131,17 @@ def _get_or_create_page(
)

def _update_document(self, document: ConfluenceDocument, base_path: Path) -> None:

for image in document.images:
self.api.upload_attachment(
document.id.page_id, base_path / image, attachment_name(image), ""
document.id.page_id, base_path / image, attachment_name(image),
)

for image,data in document.embedded_images.items():
print(image)
self.api.upload_attachment(
document.id.page_id, Path('EMB') / image, attachment_name(image),
raw_data=data,
)

content = document.xhtml()
87 changes: 69 additions & 18 deletions md2conf/converter.py
@@ -1,27 +1,30 @@
# mypy: disable-error-code="dict-item"

import hashlib
import importlib.resources as resources
import logging
import os.path
import pathlib
import re
import sys
import uuid
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
from typing import Dict, List, Optional, Tuple, Literal
from urllib.parse import ParseResult, urlparse, urlunparse

import lxml.etree as ET
import markdown
from lxml.builder import ElementMaker

from md2conf import kroki

namespaces = {
"ac": "http://atlassian.com/content",
"ri": "http://atlassian.com/resource/identifier",
}
for key, value in namespaces.items():
ET.register_namespace(key, value)


HTML = ElementMaker()
AC = ElementMaker(namespace=namespaces["ac"])
RI = ElementMaker(namespace=namespaces["ri"])
@@ -142,6 +145,7 @@ def elements_from_strings(items: List[str]) -> ET._Element:
"kotlin",
"livescript",
"lua",
"mermaid",
"mathematica",
"matlab",
"objectivec",
@@ -222,6 +226,8 @@ class ConfluenceConverterOptions:
"""

ignore_invalid_url: bool = False
render_mermaid: bool = False
kroki_output_format: Literal['png', 'svg'] = 'png'


class ConfluenceStorageFormatConverter(NodeVisitor):
@@ -232,6 +238,7 @@ class ConfluenceStorageFormatConverter(NodeVisitor):
base_path: pathlib.Path
links: List[str]
images: List[str]
embedded_images: Dict[str, bytes]
page_metadata: Dict[pathlib.Path, ConfluencePageMetadata]

def __init__(
@@ -246,6 +253,7 @@ def __init__(
self.base_path = path.parent
self.links = []
self.images = []
self.embedded_images = {}
self.page_metadata = page_metadata

def _transform_link(self, anchor: ET._Element) -> None:
@@ -317,8 +325,8 @@ def _transform_image(self, image: ET._Element) -> ET._Element:
if path and is_relative_url(path):
relative_path = pathlib.Path(path)
if (
relative_path.suffix == ".svg"
and (self.base_path / relative_path.with_suffix(".png")).exists()
relative_path.suffix == ".svg"
and (self.base_path / relative_path.with_suffix(".png")).exists()
):
path = str(relative_path.with_suffix(".png"))

@@ -349,19 +357,57 @@ def _transform_block(self, code: ET._Element) -> ET._Element:
language = "none"
content: str = code.text or ""
content = content.rstrip()
return AC(
"structured-macro",
{
ET.QName(namespaces["ac"], "name"): "code",
ET.QName(namespaces["ac"], "schema-version"): "1",
},
AC("parameter", {ET.QName(namespaces["ac"], "name"): "theme"}, "Midnight"),
AC("parameter", {ET.QName(namespaces["ac"], "name"): "language"}, language),
AC(
"parameter", {ET.QName(namespaces["ac"], "name"): "linenumbers"}, "true"
),
AC("plain-text-body", ET.CDATA(content)),
)

if language == "mermaid":
if self.options.render_mermaid:
image_data = kroki.render(content, output_format=self.options.kroki_output_format)
image_hash = hashlib.md5(image_data).hexdigest()
image_filename = attachment_name(f"embedded/{image_hash}.{self.options.kroki_output_format}")
self.embedded_images[image_filename] = image_data
return AC(
"image",
{
ET.QName(namespaces["ac"], "align"): "center",
ET.QName(namespaces["ac"], "layout"): "center",
},
RI(
"attachment",
{ET.QName(namespaces["ri"], "filename"): image_filename},
),
)
else:
local_id = str(uuid.uuid4())
macro_id = str(uuid.uuid4())
return AC(
"structured-macro",
{
ET.QName(namespaces["ac"], "name"): "macro-diagram",
ET.QName(namespaces["ac"], "schema-version"): "1",
ET.QName(namespaces["ac"], "data-layout"): "default",
ET.QName(namespaces["ac"], "local-id"): local_id,
ET.QName(namespaces["ac"], "macro-id"): macro_id,
},
AC("parameter", {ET.QName(namespaces["ac"], "name"): "sourceType"}, "MacroBody"),
AC("parameter", {ET.QName(namespaces["ac"], "name"): "attachmentPageId"}),
AC("parameter", {ET.QName(namespaces["ac"], "name"): "syntax"}, "Mermaid"),
AC("parameter", {ET.QName(namespaces["ac"], "name"): "attachmentId"}),
AC("parameter", {ET.QName(namespaces["ac"], "name"): "url"}),
AC("plain-text-body", ET.CDATA(content)),
)
Owner commented:

Could you put a return statement after this line that returns the payload specific to the mermaid language?

else:
return AC(
"structured-macro",
{
ET.QName(namespaces["ac"], "name"): "code",
ET.QName(namespaces["ac"], "schema-version"): "1",
},
AC("parameter", {ET.QName(namespaces["ac"], "name"): "theme"}, "Midnight"),
AC("parameter", {ET.QName(namespaces["ac"], "name"): "language"}, language),
AC(
"parameter", {ET.QName(namespaces["ac"], "name"): "linenumbers"}, "true"
),
AC("plain-text-body", ET.CDATA(content)),
)
Owner commented on lines +398 to +410:

Could you remove the nesting in the else block (and use a return statement above) so that this does not appear as a "phantom change"?
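
The two comments above amount to a guard-clause layout: return early from the Mermaid branch so the original code-macro path keeps its indentation and drops out of the diff. Below is a simplified sketch of that control flow. `transform_block` and its `(kind, payload)` return value are stand-ins for illustration only; the real `_transform_block` builds lxml elements.

```python
from typing import Tuple


def transform_block(language: str, content: str, render_mermaid: bool) -> Tuple[str, str]:
    """Simplified stand-in for _transform_block, returning (kind, payload)."""
    if language == "mermaid":
        if render_mermaid:
            # Rendered diagram, referenced as an image attachment.
            return ("image", content)
        # Diagram source passed through to a Confluence diagram macro.
        return ("macro-diagram", content)

    # Original path: not wrapped in an else block, so its indentation is unchanged.
    return ("code", content)
```

Because every Mermaid case returns before the final statement, the last `return` needs no `else`, and the pre-existing lines would not be re-indented.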


def _transform_toc(self, code: ET._Element) -> ET._Element:
return AC(
@@ -567,6 +613,8 @@ class ConfluenceDocumentOptions:
ignore_invalid_url: bool = False
generated_by: Optional[str] = "This page has been generated with a tool."
root_page_id: Optional[str] = None
render_mermaid: bool = False
kroki_output_format: str = 'png'


class ConfluenceDocument:
@@ -624,14 +672,17 @@ def __init__(

converter = ConfluenceStorageFormatConverter(
ConfluenceConverterOptions(
ignore_invalid_url=self.options.ignore_invalid_url
ignore_invalid_url=self.options.ignore_invalid_url,
render_mermaid=self.options.render_mermaid,
kroki_output_format=self.options.kroki_output_format,
),
path,
page_metadata,
)
converter.visit(self.root)
self.links = converter.links
self.images = converter.images
self.embedded_images = converter.embedded_images

def xhtml(self) -> str:
return _content_to_string(self.root)
28 changes: 28 additions & 0 deletions md2conf/kroki.py
@@ -0,0 +1,28 @@
import base64
from typing import Literal

import requests
import zlib

import os


def get_kroki_server() -> str:
    return os.getenv('KROKI_SERVER_URL', 'https://kroki.io')


def render(source: str, output_format: Literal['png', 'svg'] = 'png') -> bytes:
    compressed_source = zlib.compress(source.encode('utf-8'), 9)
    encoded_source = base64.urlsafe_b64encode(compressed_source).decode('ascii')
    kroki_server = get_kroki_server()
    kroki_url = f"{kroki_server}/mermaid/{output_format}/{encoded_source}"
    response = requests.get(kroki_url)

    if response.status_code == 200:
        if output_format == 'png':
            return response.content
        else:
            return response.text.encode('utf-8')
    else:
        raise Exception(f"Failed to render Mermaid diagram. Status code: {response.status_code}")
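
The URL that `render` requests embeds the diagram source itself: the text is deflate-compressed and then encoded as URL-safe Base64. The sketch below isolates that step so it can be checked without a network call. `encode_diagram`, `decode_diagram` and `build_url` are hypothetical helper names, and stripping a trailing slash from the server URL is an addition that the diff does not make.

```python
import base64
import zlib


def encode_diagram(source: str) -> str:
    """Compress and URL-safe Base64 encode diagram source, as render() does."""
    compressed = zlib.compress(source.encode("utf-8"), 9)
    return base64.urlsafe_b64encode(compressed).decode("ascii")


def decode_diagram(encoded: str) -> str:
    """Inverse of encode_diagram, handy for inspecting a generated URL."""
    return zlib.decompress(base64.urlsafe_b64decode(encoded)).decode("utf-8")


def build_url(server: str, source: str, output_format: str = "png") -> str:
    """Assemble the GET URL; rstrip("/") is an assumption, not in the diff."""
    return f"{server.rstrip('/')}/mermaid/{output_format}/{encode_diagram(source)}"
```

Because the whole diagram travels in the URL path, very large diagrams may produce URLs longer than some servers or proxies accept.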
