4.0.0-beta1 preview release

Pre-release
@dmbaturin dmbaturin released this 07 Feb 18:22

Soupault 4.0.0 release will bring multiple new features, including:

  • Taxonomy and pagination generators written in Lua.
  • Ability to take over page processing stages and augment them with Lua hooks.
  • An option to make index data accessible from content pages.
  • Ability to mark certain index fields as required.
  • Multiple new Lua plugin functions, including Unicode-aware string length and truncation, new HTML helpers, and more.

Documentation and examples for the new features will be provided later. The goal of 4.0.0-beta1 is to make sure that old functionality is still working as before for everyone, save for a minor breaking change.

Breaking changes

Index view option index_item_template does not have a default value anymore.

If you have an index view that sets none of index_template, index_item_template, or index_processor
and it's working fine for you, you need to add the original default value of index_item_template
to that view explicitly to make it work like before.

[index.views.some_view]
  index_item_template = '''<div> <a href="{{url}}">{{title}}</a> </div>'''

New features

Required index fields

It's now possible to mark certain index fields as required. If a required field is not present in a page,
soupault will display an error and stop.

Example:

[index.fields.title]
  selector = ["h1#post-title", "h1"]
  required = true

Page processing hooks

The first big feature in this release is the long-promised system of page processing hooks.

As of this release, there are the following hooks: pre-parse, pre-process, post-index, render, and save.

  • pre-parse: operates on the page text before it's parsed, must place the modified page source in the page_source variable.
  • pre-process: operates on the page element tree just after parsing, may modify the page variable and set target_dir and target_file variables.
  • post-index: operates on the page element tree after index data extraction, can add more fields and override fields in the index_entry variable.
  • render: takes over the rendering process, must put rendered page text in the page_source variable.
  • save: takes over the page output process.

For example, this is how you can do global variable substitution with a pre-parse hook:

[hooks.pre-parse]
  lua_source = '''
soupault_release = soupault_config["custom_options"]["latest_soupault_version"]
Log.debug("running pre-parse hook")
page_source = Regex.replace_all(page_source, "\\$SOUPAULT_RELEASE\\$", soupault_release)
'''
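
As a further illustration, a post-index hook could supply a fallback value for a missing index field. This is only a sketch based on the hook description above: it assumes that index_entry behaves like a Lua table with fields stored under their names, and the "Untitled" placeholder is invented for the example.

```toml
[hooks.post-index]
  lua_source = '''
-- Assumption: index_entry is a table keyed by index field name.
if not index_entry["title"] then
  Log.debug("page has no title, using a placeholder")
  index_entry["title"] = "Untitled"
end
'''
```

Since documentation for hooks is still pending, treat the exact variable layout as subject to change.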

Lua index processors

It's now possible to write index generators in Lua. Lua code can be given inline inside an index view using the lua_source option or loaded from an external file (using the file option).

For example, this is a reimplementation of the built-in index_template option in Lua:

[index.views.blog]
  index_selector = "div#blog-index"

  index_template = """
    {% for e in entries %}
      <div class="entry">
        <a href="{{e.url}}">{{e.title}}</a> (<time>{{e.date}}</time>)
      </div>
    {% endfor %}
  """

  lua_source = """
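    -- The template iterates over "entries", so pass the site index under that name
    env = {}
    env["entries"] = site_index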
    rendered_entries = HTML.parse(String.render_template(config["index_template"], env))
    container = HTML.select_one(page, config["index_selector"])
    HTML.append_child(container, rendered_entries)
  """

To support passing custom options to Lua index processors, index views now allow arbitrary options, just as widget config sections do.
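
A minimal sketch of how custom options might be used: the view below limits the index to a few entries. The option names max_items and item_list_template are hypothetical, not built-in soupault options, and the code assumes that the list of index entries is available to the processor as site_index.

```toml
[index.views.recent]
  index_selector = "div#recent-posts"

  # Hypothetical custom options, read by the Lua code below
  max_items = 5
  item_list_template = """
    <ul>
      {% for e in entries %}
      <li><a href="{{e.url}}">{{e.title}}</a></li>
      {% endfor %}
    </ul>
  """

  lua_source = """
    -- Assumption: site_index holds the list of index entries
    env = {}
    -- Table.take removes up to max_items entries from the list and returns them
    env["entries"] = Table.take(site_index, config["max_items"])
    rendered = HTML.parse(String.render_template(config["item_list_template"], env))
    container = HTML.select_one(page, config["index_selector"])
    HTML.append_child(container, rendered)
  """
```

Keeping the limit and the template in the view config rather than in the Lua code means the same processor can be reused by several views with different settings.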

The most important advantage of Lua index processors is that they can generate new pages, so it's now possible to generate taxonomies and paginated indices in a single soupault run and without any external tools.

Access to index data from content pages

There's a new index.index_first option.

[index]
  index_first = true

When set to true, that option will make soupault perform a reduced first pass where it does the bare minimum of work required
to produce the site index. It will read pages and run widgets specified in index.extract_after_widgets, but will not finish
processing any pages and will not write them to disk.

Then it will perform a second pass to actually render the website. Every plugin running on every page can access that page's index entry
via a new index_entry variable. This way you can avoid having to store index data externally and run soupault twice,
even though a certain amount of work is still done twice behind the scenes.
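
A sketch of how a plugin might read the current page's index entry during the second pass. The plugin name show-title is made up for the example, and the code assumes that index fields are stored in index_entry under their names.

```toml
[index]
  index_first = true

[index.fields.title]
  selector = ["h1"]

[plugins.show-title]
  lua_source = '''
    -- index_entry is nil unless index.index_first is enabled
    if index_entry then
      -- Assumption: fields are stored under their names
      title = index_entry["title"]
      if title then
        Log.debug("index entry title: " .. title)
      end
    end
  '''

[widgets.show-title]
  widget = "show-title"
```

Checking index_entry for nil first keeps the plugin usable in configurations where index_first is not set.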

Inline Lua plugins

Lua plugin code can now be inlined in soupault.toml:

[plugins.test-plugin]
  lua_source = '''
    Log.debug("Test plugin!")
    Plugin.exit("Test plugin executed successfully")
'''

New Lua plugin API functions and variables

New variables

  • target_file (path to the output file, relative to the current working directory).
  • index_entry (the index entry of the page being processed if index.index_first = true, otherwise it's nil).

New functions

  • String.slugify_soft(string) replaces all whitespace with hyphens, but doesn't touch any other characters.
  • HTML.to_string(etree) and HTML.pretty_print(etree) return string representations of element trees, useful for save hooks.
  • HTML.create_document() creates an empty element tree.
  • HTML.clone_page(etree) makes a copy of a complete element tree.
  • HTML.append_root(etree, node) adds a node after the last element.
  • HTML.child_count(elem) returns the number of children of an element.
  • HTML.unwrap(elem) yanks the child elements out of a parent and inserts them in its former place.
  • Table.take(table, limit) removes up to limit items from a table and returns them.
  • Table.chunks(table, size) splits a table into chunks of up to size items.
  • Table.has_value(table, value) returns true if value is present in table.
  • Table.apply_to_values(func, table) applies function func to every value in table (a simpler version of Table.apply if you don't care about keys).
  • Table.keys(table) returns a list of all keys in table.
  • Sys.list_dir(path) returns a list of all files in path.
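
A few of these functions in use, as a rough sketch in an inline plugin. The plugin name api-demo, the sample strings, and the main selector are invented for the example; the calls follow the signatures listed above.

```toml
[plugins.api-demo]
  lua_source = '''
    -- String.slugify_soft only replaces whitespace
    slug = String.slugify_soft("Release notes for 4.0.0")
    Log.debug("slug: " .. slug)

    -- Table.has_value checks whether a value is present in a table
    tags = {}
    tags[1] = "lua"
    tags[2] = "hooks"
    if Table.has_value(tags, "hooks") then
      Log.debug("the hooks tag is present")
    end

    -- HTML.child_count can detect an empty container
    main = HTML.select_one(page, "main")
    if main and (HTML.child_count(main) == 0) then
      Log.debug("the main element has no children")
    end
  '''

[widgets.api-demo]
  widget = "api-demo"
```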

Unicode string functions

  • String.length is now Unicode-aware. The old implementation is still available as String.length_ascii.
  • String.truncate is now Unicode-aware. The old implementation is still available as String.truncate_ascii.
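
A small sketch of the difference, assuming that String.length_ascii counts bytes rather than characters. The plugin name length-demo and the sample string are made up for the example.

```toml
[plugins.length-demo]
  lua_source = '''
    s = "naïve café"
    -- Assumption: the ASCII version counts bytes, so the two results
    -- differ when the string contains multi-byte characters
    if String.length(s) ~= String.length_ascii(s) then
      Log.debug("the string contains multi-byte characters")
    end
  '''

[widgets.length-demo]
  widget = "length-demo"
```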

Bug fixes

  • The numeric index entry sorting method works correctly again.
  • Looking up values in nested inline tables works correctly now (fixed in OTOML 1.0.1).
  • Fixed an unhandled exception when handling misconfigured index views.
  • Fixed a possible unhandled exception during page processing.

Misc

  • If index.sort_by is not set, entries are now sorted by their url field rather than displayed in arbitrary order.
  • Sys.list_dir correctly handles errors when the argument is not a directory.
  • Preprocessor commands are now quoted in debug logs for better readability (e.g. running preprocessor "cmark --unsafe --smart"...).