fixed conflicts
ralsina committed Jun 6, 2017
2 parents 8ba4937 + 156277a commit de8d483596401d8c6c53db1c757707f637ff3119
Showing with 57 additions and 2 deletions.
  1. +1 −0 CHANGES.txt
  2. +26 −0 docs/manual.txt
  3. +30 −2 nikola/plugins/compile/html.py
@@ -6,6 +6,7 @@ Features

* New METADATA_FORMAT option to choose preferred metadata format
(Nikola/YAML/TOML/Pelican) (Part of Issue #2801)
* Extract metadata from HTML meta and title tags like Pelican (Issue #1923)

New in v7.8.7
=============
@@ -418,6 +418,7 @@ other static site generators. The currently supported metadata formats are:
* TOML, between ``+++`` (Hugo)
* reST docinfo (Pelican)
* Markdown metadata extension (Pelican)
* HTML meta tags (Pelican)

You can add arbitrary meta fields in any format.

@@ -537,6 +538,30 @@ the `markdown metadata extension docs. <https://pythonhosted.org/Markdown/extens

Note that keys are converted to lowercase automatically.

HTML meta tags
``````````````

For HTML source files, metadata will be extracted from ``meta`` tags, and the title from the ``title`` tag.
Following Pelican's behaviour, tags can be put in a "tags" meta tag or in a "keywords" meta tag. Example:

.. code:: html

    <html>
        <head>
            <title>My super title</title>
            <meta name="tags" content="thats, awesome" />
            <meta name="date" content="2012-07-09 22:28" />
            <meta name="modified" content="2012-07-10 20:14" />
            <meta name="category" content="yeah" />
            <meta name="authors" content="Conan Doyle" />
            <meta name="summary" content="Short version for index and feeds" />
        </head>
        <body>
            This is the content of my super blog post.
        </body>
    </html>


Mapping metadata from other formats
```````````````````````````````````

@@ -549,6 +574,7 @@ For Pelican, use:
    METADATA_MAPPING = {
        "rest_docinfo": {"summary": "description", "modified": "updated"},
        "markdown_metadata": {"summary": "description", "modified": "updated"},
        "html_metadata": {"summary": "description", "modified": "updated"}
    }
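
To illustrate the effect of such a mapping, here is a minimal sketch that renames foreign keys in a metadata dictionary. The helper name ``apply_mapping`` is hypothetical and this is not Nikola's ``map_metadata`` implementation; it only shows the kind of renaming the setting describes, assuming each entry maps a foreign key to the Nikola key:

```python
# Hypothetical helper for illustration; Nikola's own map_metadata may differ.
METADATA_MAPPING = {
    "html_metadata": {"summary": "description", "modified": "updated"},
}


def apply_mapping(meta, source_format, mapping=METADATA_MAPPING):
    """Rename foreign keys in ``meta`` in place, using the mapping for ``source_format``."""
    for foreign_key, nikola_key in mapping.get(source_format, {}).items():
        if foreign_key in meta:
            # Move the value to the Nikola key and drop the foreign key.
            meta[nikola_key] = meta.pop(foreign_key)
    return meta


pelican_meta = {
    "title": "My super title",
    "summary": "Short version for index and feeds",
    "modified": "2012-07-10 20:14",
}
mapped = apply_mapping(dict(pelican_meta), "html_metadata")
print(mapped)
```

Keys that have no entry in the mapping, such as ``title`` here, are left untouched.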

For Hugo, use:
@@ -28,12 +28,14 @@

from __future__ import unicode_literals

-import os
import io
import os

import lxml.html

from nikola import shortcodes as sc
from nikola.plugin_categories import PageCompiler
-from nikola.utils import makedirs, write_metadata
from nikola.utils import LocaleBorg, makedirs, map_metadata, write_metadata


class CompileHtml(PageCompiler):
@@ -84,3 +86,29 @@ def create_post(self, path, **kw):
fd.write(write_metadata(metadata))
fd.write('-->\n\n')
fd.write(content)

    def read_metadata(self, post, file_metadata_regexp=None, unslugify_titles=False, lang=None):
        """Read the metadata from a post's meta tags, and return a metadata dict."""
        if lang is None:
            lang = LocaleBorg().current_lang
        source_path = post.translated_source_path(lang)

        with io.open(source_path, 'r', encoding='utf-8') as inf:
            data = inf.read()

        metadata = {}
        doc = lxml.html.document_fromstring(data)
        title_tag = doc.find('*//title')
        if title_tag is not None:
            metadata['title'] = title_tag.text
        meta_tags = doc.findall('*//meta')
        for tag in meta_tags:
            # Tags such as <meta charset="utf-8"> have no name attribute;
            # default to '' so .lower() is never called on None.
            k = tag.get('name', '').lower()
            if not k:
                continue
            elif k == 'keywords':
                k = 'tags'
            metadata[k] = tag.get('content', '')
        map_metadata(metadata, 'html_metadata', self.site.config)
        return metadata
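
The extraction logic above can be tried without Nikola or lxml. The following is a minimal sketch, assuming Python 3 and swapping in the standard library's `html.parser` for `lxml.html`; the class name `MetaExtractor` and the `SAMPLE` document are hypothetical and only mirror the rules described in the commit (title from `<title>`, named `<meta>` tags as keys, `keywords` treated as `tags`, unnamed `<meta>` tags skipped):

```python
from html.parser import HTMLParser


class MetaExtractor(HTMLParser):
    """Collect <title> text and named <meta> tags into a dict (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            # <meta charset="utf-8"> and similar tags have no name; skip them.
            key = (attrs.get("name") or "").lower()
            if not key:
                return
            if key == "keywords":
                key = "tags"
            self.metadata[key] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            # Append, in case the parser delivers the title in several chunks.
            self.metadata["title"] = self.metadata.get("title", "") + data


SAMPLE = """<html>
<head>
<meta charset="utf-8">
<title>My super title</title>
<meta name="keywords" content="thats, awesome" />
<meta name="date" content="2012-07-09 22:28" />
</head>
<body>This is the content of my super blog post.</body>
</html>"""

parser = MetaExtractor()
parser.feed(SAMPLE)
parser.close()
extracted = parser.metadata
print(extracted)
```

The sketch stops before the mapping step; in the commit, `map_metadata` is applied to the resulting dict afterwards.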
