Support more markdown syntax #1704

Closed
florianm opened this issue May 8, 2014 · 6 comments

Comments

@florianm
Contributor

florianm commented May 8, 2014

In markdown fields, support at least superscript and subscript, e.g. ^superscript^ and ~subscript~. Ideally, support pandoc's markdown flavour.

@formwandler

See also related #1332 and issue #3 from ckanext-pages.

@florianm
Contributor Author

florianm commented May 8, 2014

thanks @formwandler!

@rossjones
Contributor

Does anybody fancy submitting a PR for this idea? It's likely to be closed due to old age otherwise.

@TkTech closed this as completed May 10, 2016

@davidread
Contributor

This issue was closed due to inactivity. Feel free to reopen if you have more feedback or are interested in working on it.

@thorge

thorge commented Jun 12, 2023

I would like to have support for markdown tables. Of course it would be great to have the possibility to add more markdown features in general. Fortunately, the Markdown package used by CKAN can handle a lot of features via extensions.

On the python-markdown page it says:

If you have a typical install of Python-Markdown, these extensions are already available to you using the "Entry Point" name listed in the second column below.

So I guess it would be sufficient to convert all markdown() calls in CKAN to:

markdown(some_text, extensions=list_of_active_extensions)

A config variable that defines the desired active extensions would be handy.
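
To make the config idea a bit more concrete, here is a minimal sketch of how a single config value could be turned into the extensions list. The option name `ckan.markdown.extensions` and both helper functions are hypothetical (nothing like this exists in CKAN as far as this thread shows); only the `markdown(text, extensions=[...])` call comes from Python-Markdown itself.

```python
from typing import List


def parse_extension_list(raw_value: str) -> List[str]:
    """Split a comma- or whitespace-separated config value into extension names.

    Example input (hypothetical config option):
        ckan.markdown.extensions = tables fenced_code
    """
    names = raw_value.replace(",", " ").split()
    seen = set()
    result = []
    for name in names:
        # Keep the first occurrence of each name so the order stays stable.
        if name not in seen:
            seen.add(name)
            result.append(name)
    return result


def render_with_extensions(text: str, raw_config_value: str) -> str:
    """Render Markdown using the extensions named in the config value."""
    # Imported lazily so the parsing helper can be used and tested
    # even where Python-Markdown is not installed.
    from markdown import markdown

    return markdown(text, extensions=parse_extension_list(raw_config_value))
```

An empty config value yields an empty list, which keeps the current behaviour of calling `markdown()` without any extensions.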

Any thoughts on this?

@themowski

themowski commented Jan 9, 2024

@thorge is definitely on the right track in terms of adding better Markdown support to CKAN. Allowing users to specify some of the built-in extensions should be pretty straightforward and would improve the situation. However, some of the built-in extensions (e.g., CodeHilite) require additional Python packages to be installed and/or additional configuration in order to work correctly -- it's not clear whether CKAN should be updated to "prepare" for users wanting to use those packages, or whether that integration would need to be an end-user's problem (or just "unsupported by CKAN").
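
One possible middle ground for the optional-dependency question is to check at startup whether the packages an extension needs are importable, and skip (or warn about) the ones that are not. The sketch below is only an illustration: the `EXTENSION_REQUIREMENTS` mapping and the function name are made up for this example, and the only pairing taken from the Python-Markdown docs is CodeHilite's use of Pygments for highlighting.

```python
from importlib.util import find_spec
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical mapping: extension name -> top-level modules it needs.
# Only the codehilite/pygments pairing reflects the Python-Markdown docs.
EXTENSION_REQUIREMENTS: Dict[str, List[str]] = {
    "codehilite": ["pygments"],
}


def split_usable_extensions(
    requested: Iterable[str],
    requirements: Optional[Dict[str, List[str]]] = None,
) -> Tuple[List[str], List[str]]:
    """Return (usable, skipped) extension names based on importable modules."""
    if requirements is None:
        requirements = EXTENSION_REQUIREMENTS
    usable: List[str] = []
    skipped: List[str] = []
    for name in requested:
        # find_spec returns None when a top-level module cannot be found.
        missing = [m for m in requirements.get(name, []) if find_spec(m) is None]
        if missing:
            skipped.append(name)
        else:
            usable.append(name)
    return usable, skipped
```

Extensions with no entry in the mapping are treated as having no extra requirements, so built-ins like `tables` pass straight through.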

From an implementation standpoint, in addition to passing extensions=[] to calls to the markdown() function in ckan/lib/helpers.py, we would also need to add additional allowed tags to the ckan.lib.helpers.MARKDOWN_TAGS set. This variable controls what bleach.clean() strips or keeps when it sanitizes the HTML that Markdown produces.

Here's a sample diff for anyone who wants to try patching this on their own. This diff was made against the ckan-2.10.1 tag and demonstrates enabling the Tables and Fenced Code Blocks extensions.

$ git diff
diff --git a/ckan/lib/helpers.py b/ckan/lib/helpers.py
index 3ecc187e2..32fdedc75 100644
--- a/ckan/lib/helpers.py
+++ b/ckan/lib/helpers.py
@@ -71,7 +71,8 @@ log = logging.getLogger(__name__)
 MARKDOWN_TAGS = set([
     'del', 'dd', 'dl', 'dt', 'h1', 'h2',
     'h3', 'img', 'kbd', 'p', 'pre', 's',
-    'sup', 'sub', 'strike', 'br', 'hr'
+    'sup', 'sub', 'strike', 'br', 'hr',
+    'table', 'thead', 'tbody', 'th', 'td', 'tr',
 ]).union(ALLOWED_TAGS)
 
 MARKDOWN_ATTRIBUTES = copy.deepcopy(ALLOWED_ATTRIBUTES)
@@ -2169,11 +2170,11 @@ def render_markdown(data: str,
     if not data:
         return ''
     if allow_html:
-        data = markdown(data.strip())
+        data = markdown(data.strip(), extensions=["tables", "fenced_code"])
     else:
         data = RE_MD_HTML_TAGS.sub('', data.strip())
         data = bleach_clean(
-            markdown(data), strip=True,
+            markdown(data, extensions=["tables", "fenced_code"]), strip=True,
             tags=MARKDOWN_TAGS,
             attributes=MARKDOWN_ATTRIBUTES)
     # tags can be added by tag:... or tag:"...." and a link will be made
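
If the extension list became configurable, the allowed tags would have to follow it rather than being hard-coded as in the diff. A rough sketch of that idea, assuming a hypothetical `EXTRA_TAGS_BY_EXTENSION` mapping (not part of CKAN or Python-Markdown) that someone would need to maintain by hand. Note that the Tables extension emits `th` for header cells in addition to `td`, so that tag needs to be allowed as well.

```python
from typing import Dict, FrozenSet, Iterable, Set

# Hypothetical, hand-maintained mapping of extension name -> HTML tags
# that the sanitizer must allow for that extension's output to survive.
EXTRA_TAGS_BY_EXTENSION: Dict[str, FrozenSet[str]] = {
    "tables": frozenset({"table", "thead", "tbody", "tr", "th", "td"}),
    "fenced_code": frozenset({"pre", "code"}),
}


def allowed_tags_for(base_tags: Iterable[str], extensions: Iterable[str]) -> Set[str]:
    """Return the base allow-list extended with tags for the active extensions."""
    tags = set(base_tags)
    for name in extensions:
        # Unknown extensions contribute no extra tags.
        tags |= EXTRA_TAGS_BY_EXTENSION.get(name, frozenset())
    return tags
```

The result could then be passed as the `tags=` argument in place of the fixed `MARKDOWN_TAGS` set, with `MARKDOWN_TAGS` serving as `base_tags`.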

One thing to note -- the tables that get generated by this end up being unstyled, so there's also probably some CSS work that needs to happen, and that would need to be done carefully to avoid changing things elsewhere in CKAN.

All that said, it may also be worth looking at what other packages are out there for supporting Markdown parsing in Python these days. Python-Markdown aims to be compliant with the original spec, and while there's certainly elegance in that, power users will find that spec limited compared to what's supported "natively" elsewhere. We're coming up on 10 years (!) since this issue was opened (and 8 since it was closed due to inactivity), so there may be value in a fresh approach.
