all-posts.org

Preparation

  1. Be in the root directory for this Hugo site (the directory containing config.toml).
  2. Run
    hugo server --port 1111
        
  3. See the site served on “http://localhost:1111/”.

Post 1

Export this first post only by bringing point here and doing M-x org-hugo-export-wim-to-md.

Post 2

This post has no date.

Export this second post only by bringing point here and doing M-x org-hugo-export-wim-to-md.

Image

Image links

This is some text before the first heading of this post.

Unclickable image (works!)

/images/org-mode-unicorn-logo.png


To be fixed (Now fixed): The sub-headings in a post get exported as Heading 1 instead of Heading 2.

For example, this sub-section’s heading is exported as:

# Unclickable image

instead of

## Unclickable image

Solution: Above is fixed by setting HUGO_OFFSET_LEVEL to 1.

Without that offset, the sub-heading title and the post title both get the Heading 1 tag and look the same size.
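
The effect of the offset can be sketched in a few lines of Python. This only illustrates the arithmetic and is not ox-hugo's implementation; the function name `markdown_heading` and its parameters are hypothetical.

```python
def markdown_heading(title: str, relative_level: int, offset: int = 1) -> str:
    """Build an ATX heading whose depth is the relative level plus an offset."""
    return "#" * (relative_level + offset) + " " + title


# With no offset, a first-level sub-heading collides with the post title level.
print(markdown_heading("Unclickable image", 1, offset=0))
# With an offset of 1, the same sub-heading is pushed down one level.
print(markdown_heading("Unclickable image", 1, offset=1))
```

The first call yields `# Unclickable image` and the second `## Unclickable image`, matching the two forms shown above.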

Clickable link that opens the image (works!)

Click here to see the unicorn

Clickable image that opens the image (works!)

Click below image to jump to the unicorn image. /images/org-mode-unicorn-logo.png

NOTE
file: has to be used in both Link and Description components of the Org link.

Link to image outside of standard Hugo static directory

../files-to-be-copied-to-static/static/images/copy-of-unicorn-logo.png

If you link to files outside of the Hugo static directory, ensure that the path contains /static/ if you would like to preserve the directory structure.

Example translations between outside static directory paths to the copied location inside static:

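| Outside static                 | Copied-to location inside static        | Explanation                                                                                          |
|--------------------------------+-----------------------------------------+------------------------------------------------------------------------------------------------------|
| ~/temp/static/images/foo.png   | <HUGO_BASE_DIR>/static/images/foo.png   | If the outside path has /static/ in it, the directory structure after that is preserved when copied. |
| ~/temp/static/img/foo.png      | <HUGO_BASE_DIR>/static/img/foo.png      | (same as above)                                                                                      |
| ~/temp/static/foo.png          | <HUGO_BASE_DIR>/static/foo.png          | (same as above)                                                                                      |
| ~/temp/static/articles/zoo.pdf | <HUGO_BASE_DIR>/static/articles/zoo.pdf | (same as above)                                                                                      |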

Source path does not contain /static/

../files-to-be-copied-to-static/foo/copy-2-of-unicorn-logo.png

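| Outside static         | Copied-to location inside static      | Explanation                                                                                                  |
|------------------------+---------------------------------------+--------------------------------------------------------------------------------------------------------------|
| ~/temp/bar/baz/foo.png | <HUGO_BASE_DIR>/static/ox-hugo/foo.png | Here, as the outside path does not have /static/, the file is copied to the ox-hugo/ dir in Hugo static/ dir. |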
Note
The ox-hugo sub-directory name is because of the default value of org-hugo-default-static-subdirectory-for-externals.
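
The two translation rules above can be summarized in a short Python sketch. It is an illustration only, not ox-hugo's code: the function name `copied_location` is hypothetical, the result is given relative to `<HUGO_BASE_DIR>`, and splitting at the first occurrence of `/static/` is an assumption.

```python
from pathlib import PurePosixPath

# Mirrors the default sub-directory name mentioned in the note above.
DEFAULT_SUBDIR = "ox-hugo"


def copied_location(outside_path: str, subdir: str = DEFAULT_SUBDIR) -> str:
    """Return the copy destination, relative to <HUGO_BASE_DIR>."""
    marker = "/static/"
    if marker in outside_path:
        # Keep everything after /static/ so the directory structure survives.
        # Assumption: the first occurrence of /static/ is the one that counts.
        tail = outside_path.split(marker, 1)[1]
        return "static/" + tail
    # No /static/ in the path: only the file name is kept.
    return f"static/{subdir}/{PurePosixPath(outside_path).name}"


for path in ("~/temp/static/images/foo.png", "~/temp/bar/baz/foo.png"):
    print(path, "->", copied_location(path))
```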

Image captions

Some text before image.

/images/org-mode-unicorn-logo.png Some more text, after image.

Image with Hugo figure shortcode parameters set using ATTR_HTML

{{{oxhugoissue(17)}}}

Setting class parameter

/images/org-mode-unicorn-logo.png

Discussion

Setting alt parameter

Reference

/images/org-mode-unicorn-logo.png

Setting title parameter

/images/org-mode-unicorn-logo.png

Setting image caption

The image caption can be set in two ways.

  1. Using the Org #+CAPTION keyword
  2. Using #+ATTR_HTML: :caption my caption

If #+CAPTION is available, it gets the higher precedence. In the below image, caption is set using that:

Below, the same caption is set using the #+ATTR_HTML method instead:

Some text before image.

/images/org-mode-unicorn-logo.png Some more text, after image.

Setting image size

Setting :width parameter

The image width can be specified in pixels using the :width parameter. The height of the image will be resized proportionally.

Below image is shown 50 pixels wide.

/images/org-mode-unicorn-logo.png

Below image is shown 100 pixels wide.

/images/org-mode-unicorn-logo.png

Below image is shown with a width of 1000 pixels! The actual size of this image is 200px × 200px, but it will still show up at 1000px × 1000px, obviously heavily pixelated!

/images/org-mode-unicorn-logo.png

Setting :height parameter

NOTE: Support for specifying the height parameter to the Hugo figure shortcode was only added recently in hugo PR #4018. So setting this parameter will need hugo v0.31 or later.


The image height can be specified in pixels using the :height parameter. The width of the image will be resized proportionally.

Below image is shown 50 pixels tall.

/images/org-mode-unicorn-logo.png

Below image is shown 100 pixels tall.

/images/org-mode-unicorn-logo.png

Below image is shown with a height of 1000 pixels! The actual size of this image is 200px × 200px, but it will still show up at 1000px × 1000px, obviously heavily pixelated!

/images/org-mode-unicorn-logo.png

Setting both :width and :height

The NOTE above applies here too: this needs hugo v0.31 or later.

The figure sizes below are intentionally set mis-proportionally just for testing.

  • :width 100 :height 200

    /images/org-mode-unicorn-logo.png

  • :width 200 :height 100

    /images/org-mode-unicorn-logo.png
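
The sizing behavior described in this section comes down to simple proportional arithmetic. The Python sketch below illustrates it under the assumption that the browser scales the missing dimension proportionally; `scaled_size` is a hypothetical helper, and the 200 × 200 natural size is the one quoted above.

```python
def scaled_size(natural_w, natural_h, width=None, height=None):
    """Return the displayed (width, height) in pixels."""
    if width is not None and height is not None:
        # Both given: used as-is, even if that distorts the image.
        return (width, height)
    if width is not None:
        return (width, round(natural_h * width / natural_w))
    if height is not None:
        return (round(natural_w * height / natural_h), height)
    return (natural_w, natural_h)


print(scaled_size(200, 200, width=50))               # (50, 50)
print(scaled_size(200, 200, height=1000))            # (1000, 1000)
print(scaled_size(200, 200, width=100, height=200))  # (100, 200)
```

The 1000-pixel cases are a 5x upscale of the 200-pixel original, which is why they look pixelated.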

Multiple ATTR_HTML

Below in Org source:

#+HTML: <style>.foo img { border:2px solid black; }</style>
#+ATTR_HTML: :alt Go is fine though.
#+ATTR_HTML: :width 300 :class foo
[[file:https://golang.org/doc/gopher/pkg.png]]

Rendered this:

.foo img { border:2px solid black; }

https://golang.org/doc/gopher/pkg.png

NOTE: We cannot use :style in #+ATTR_HTML because Hugo does not yet support a style argument in the figure shortcode [Source].

So #+HTML: <style>.foo img ... </style> and #+ATTR_HTML: :class foo are used instead, as shown in the workaround above.

Other

Similarly, :link, :attr, :attrlink parameters in #+ATTR_HTML are also supported to set the corresponding parameter in the Hugo figure shortcode.

Setting heading anchors

Heading 1 of the post

Something

Heading 1.1 of the post

Something else

Heading 2 of the post

Something 2

Heading 2.1 of the post

Something 2.1

Post heading with crazy characters

Releasing version 1.1

Foo ( Bar ) Baz

(Foo)Bar.Baz&Zoo

Hey! I have a link here (Awesome!)

Este título es en español

Non-English titles

ÂÊÎÔÛ

ÁÉÍÓÚÝ

ÀÈÌÒÙ

ÄËÏÖÜ

ÃÐÑÕÞ

Ç

Headings with HTML

Checklist [1/3]

Above title would render to Checklist <code>[1/3]</code> in Markdown.

Item 1

Above would render to <span class="todo DONE_">DONE </span> Item 1 in Markdown.

Item 2

Above would render to <span class="todo TODO_">TODO </span> Item 2 in Markdown.

Item 3

Above would render to <span class="todo TODO_">TODO </span> Item 3 in Markdown.

Version 0.1 <2017-10-11 Wed>

Above title would render to Version 0.1 <span class="timestamp-wrapper"><span class="timestamp">&lt;2017-10-11 Wed&gt;</span></span> in Markdown.

Title in Front Matter

Awesome title with “quoted text”

Testing a post with double quotes in the title.

Under_scores_in_title

Ensure that the underscores in title string of front matter do not get escaped.. foo_bar must not become foo\_bar.

Allow empty titles

This post will be exported without title in the front-matter because it is explicitly set to empty using :EXPORT_TITLE:.

En dash –, Em dash —, Horizontal ellipsis … in titles

This tests an ox-hugo feature that gets around an upstream limitation, where the Blackfriday smartDashes rendering does not happen in post titles ({{{hugoissue(4175)}}}).

Description meta-data with “quoted text”

Testing a post with double quotes in the description.

Excluded post

This post must not be exported as it is tagged noexport.

Section

Articles

Article 1

First article.

This will land in content/articles/ as the parent of this subtree sets EXPORT_HUGO_SECTION to articles. Note that the theme needs to define at least a single.html template, in either layouts/_default/ or layouts/articles/, and that directory can live in either the Hugo base dir or the theme dir.

Article 2

Second article.

This will also land in content/articles/ the same way.

Emacs posts

Emacs Post 1

Here is the first post on Emacs.

Emacs Post 2

Here is the second post on Emacs.

Tables

Simple Table

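| h1 | h2 |
| a  | b  |
| c  | d  |

Table with narrowest cols

| 1 | 2 | 3 |
| a | b | e |
| c | d | f |

Table with top border

| 1 | 2 | 3 | 4 |
| a | b | e | g |
| c | d | f | h |

Table with bottom border

| 1 | 2 | 3 | 4 |
| a | b | e | g |
| c | d | f | h |

Table with top and bottom border

| 1 | 2 | 3 | 4 |
| a | b | e | g |
| c | d | f | h |

Table with rule after first row

| 1 | 2 | 3 | 4 |
| a | b | e | g |
| c | d | f | h |

Table with borders and rule after first

| 1 | 2 | 3 | 4 |
| a | b | e | g |
| c | d | f | h |

Table with single column

| h1 |
| a  |
| b  |

Table with single row

| a | b |

Table with single cell

| a |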

Table column alignment

Table with 3 rows

| <r>             | <l>                  |           | <c>                           |
| Right           | Left                 | No        | Center                        |
| Long Content To | Spread Out the Width | Alignment | Of the Table To See Alignment |
| Right           | Left                 | Marker    | Center                        |

Table with 2 rows

| <l>  |         | <c>    | <r>   |
| Left | Default | Center | Right |
| 1    | 2       | 3      | 4     |

Table with 1 row

A table with just 1 row with alignment markers is as good as just that row without the alignment markers. But hey, a test is a test.

| <l>  |         | <c>    | <r>   |
| Left | Default | Center | Right |

Table with 0 rows!

A table with zero rows, with just alignment markers, doesn’t make sense. But hey, a test is a test.

<l><c><r>

You should see no table exported above.

Source blocks

Code fence

Code-fenced source blocks (default behavior)

The source blocks are code-fenced by default.

*It is necessary to set the Hugo site config variable pygmentsCodeFences to true for syntax highlighting to work for fenced code blocks.*

Code-fenced source blocks

Here the source blocks are explicitly set to be code-fenced by setting the EXPORT_HUGO_CODE_FENCE property to t.

*It is necessary to set the Hugo site config variable pygmentsCodeFences to true for syntax highlighting to work for fenced code blocks.*

Code-fenced source blocks with backticks

This code block contains a fenced code block with 4 backticks:
````emacs-lisp
(message "Hello")
````

This code block contains a fenced code block with 3 backticks:

```emacs-lisp
(message "Hello again")
```

This code block contains no backticks:

(message "Hello again x2")

This code block again contains a fenced code block with 4 backticks:

````emacs-lisp
(message "Hello again x3")
````

This code block contains a fenced code block with 6 backticks:

``````emacs-lisp
(message "Hello again x4")
``````

This code block again contains a fenced code block with 3 backticks:

```emacs-lisp
(message "Hello again x5")
```

This code block once again contains no backticks:

(message "Hello again x6")
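
One common way to keep such nested fences intact is to give the outer fence more backticks than the longest backtick run inside the block. The Python sketch below shows that general technique; it is not confirmed to be how ox-hugo chooses its fence, and `fence_for` is a hypothetical name.

```python
import re


def fence_for(code: str, minimum: int = 3) -> str:
    """Return a backtick fence longer than any backtick run inside `code`."""
    longest = max((len(run) for run in re.findall(r"`+", code)), default=0)
    return "`" * max(minimum, longest + 1)


# A block whose content itself contains a 4-backtick fence.
inner = "`" * 4 + 'emacs-lisp\n(message "Hello")\n' + "`" * 4
print(len(fence_for(inner)))                         # 5
print(len(fence_for('(message "Hello again x2")')))  # 3
```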

Highlight Shortcode

Source blocks with highlight shortcode

Note that to disable the code fence option, the value portion of the property needs to be left empty instead of setting to nil!
:PROPERTIES:
:EXPORT_HUGO_CODE_FENCE:
:END:

Source blocks with line number annotation

Cases

Default new line number start
Org source
<<src-block-n-default-start>>
Output
Specify new line number start
Org source
<<src-block-n-custom-start>>
Output
Default continued line numbers
Org source
<<src-block-n-default-continue>>
Output
Specify continued line numbers jump
Org source
<<src-block-n-custom-continue>>
Output

Source blocks with highlighting

Without line numbers

Org source
<<src-block-hl-without-n>>
Output

Above highlighting might look weird as the highlighting spans the full page/container width. This could be called either a bug in Hugo or an HTML limitation.

A workaround is below: use line numbers too!

With line numbers not starting from 1

With line numbers enabled, the highlighting is limited to the width of the HTML table rows (because ox-hugo sets the linenos=table option in the highlight shortcode when line numbers are enabled).
Note 1
When using both switches (like -n) and header args (like :hl_lines), the switches have to come first.
Note 2
The line numbers in the value of the :hl_lines parameter always count from 1, i.e. relative to the first line of the block. They are unrelated to the line numbers displayed using the -n or +n switches!
Org source
<<src-block-hl-with-n-not-1>>
Output
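
To make Note 2 concrete, the Python sketch below maps :hl_lines values, which always count from 1, to the numbers a reader sees in the gutter. It is illustrative only; `displayed_numbers` is a hypothetical helper and the starting number 7 is an arbitrary example.

```python
def displayed_numbers(hl_lines, first_displayed):
    """Map 1-based highlight positions to the numbers shown in the gutter."""
    return [first_displayed + position - 1 for position in hl_lines]


# Highlighting positions 1 and 3 in a block whose numbering starts at 7
# marks the lines labeled 7 and 9.
print(displayed_numbers([1, 3], first_displayed=7))  # [7, 9]
```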

With line numbers

Org source
<<src-block-hl-with-n>>
Output

Source block with caption

prefix = /dir/where/you/want/to/install/org # Default: /usr/share

Source block with list syntax in a list

An upstream bug in Blackfriday (Issue #239) caused fenced code blocks in lists to not render correctly if they contain Markdown syntax lists. ox-hugo provides a hack to get around that bug.

Below is an example of such a case:

  • List item 1
    - List item 1.1 in code block
    - List item 1.2 in code block
        
  • List item 2
    + List item 2.1 in code block
    + List item 2.2 in code block
        
  • List item 3

Source block without list syntax in a list

This case is not affected by Blackfriday Issue #239 as the fenced code block does not have Markdown syntax lists.
  • List item 1
    *abc*
    /def/
    =def=
        
  • List item 2

Source block with list syntax but not in a list

- list 1

Org Babel Results

str = 'a\tbc'
print(str[1:])
	bc

The whitespace before “bc” in the results block above should be preserved.

Indented source block

Test that indented source blocks export fine.
   (defun small-shell ()
	(interactive)
	(split-window-vertically)
	(other-window 1)
	(shrink-window (- (window-height) 12))
   (ansi-term))
  

More tests!

  • List item 1
    (message "I am in list at level-1 indentation")
        
    • List item 1.1
      (message "I am in list at level-2 indentation")
              
      • List item 1.1.1
        (message "I am in list at level-3 indentation")
                    
    • List item 2.1
      (message "I am in list back at level-2 indentation")
              
  • List item 2
    (message "I am in list back at level-1 indentation")
        
(message "And now I am at level-0 indentation")

Reference: {{{hugoissue(4006)}}}

Markdown source block with Hugo shortcodes

Shortcodes escaped

The figure shortcodes in the two Markdown source code blocks below should not be expanded.. they should be visible verbatim.

Code block using code fences

{{< figure src="http://orgmode.org/img/org-mode-unicorn-logo.png" >}}
{{% figure src="http://orgmode.org/img/org-mode-unicorn-logo.png" %}}

Code block using highlight shortcode

Here, the -n switch is added to the Org source block to auto-enable[fn:4] using the highlight shortcode.

{{< figure src="http://orgmode.org/img/org-mode-unicorn-logo.png" >}}
{{% figure src="http://orgmode.org/img/org-mode-unicorn-logo.png" %}}

Shortcodes not escaped

The figure shortcodes in the below example block should be expanded.. you should be seeing little unicorns below.

{{< figure src="http://orgmode.org/img/org-mode-unicorn-logo.png" >}}
{{% figure src="http://orgmode.org/img/org-mode-unicorn-logo.png" %}}

Above a #+BEGIN_EXAMPLE .. #+END_EXAMPLE block is chosen arbitrarily. The Hugo shortcodes will remain unescaped in any source/example block except for Markdown source blocks (annotated with md language).


*It is necessary to set the Hugo site config variable pygmentsCodeFences to true for syntax highlighting to work for fenced code blocks.*

Org Source Block

Test case for the case where user has set org-hugo-langs-no-descr-in-code-fences to a list containing the element org.

As this variable is dependent on user’s config, this post is not set to be exported by default.

The issue with Hugo will be seen if:

  • pygmentsCodeFences = true is set in the Hugo site config.toml,
  • a source block’s language is set to one that’s not supported by Pygments (like org, and thus the below example with source code language set to org), and
  • org-hugo-langs-no-descr-in-code-fences is set to a value not containing that language descriptor (org in this case).
# Org comment
Export this post after setting
=org-hugo-langs-no-descr-in-code-fences= to =(org)= and temporarily
removing the =noexport= tag.

Formatting

General

Below table shows the translation of Org markup to Markdown markup in the exported .md files.
| Org              | Markdown                                                       | In Hugo rendered HTML |
|------------------+----------------------------------------------------------------+-----------------------|
| *bold*           | **bold**                                                       | bold                  |
| /italics/        | _italics_                                                      | italics               |
| =monospace=      | `monospace`                                                    | monospace             |
| ~key-binding~    | `key-binding`                                                  | key-binding           |
|                  | - if org-hugo-use-code-for-kbd is nil [default]                |                       |
| ~key-binding~    | <kbd>key-binding</kbd>                                         |                       |
|                  | - if org-hugo-use-code-for-kbd is non-nil                      |                       |
|                  | - Requires CSS to render the <kbd> tag as something special.   |                       |
| +strike-through+ | ~~strike-through~~                                             | strike-through        |
| _underline_      | <span class = "underline">underline</span>                     | underline             |
|                  | - Requires CSS to render this underline class as an underline. |                       |
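
The table can be read as a lookup from the Org marker character to a Markdown wrapper. The Python sketch below encodes just those rows for a single marked-up token; it is a simplified illustration, not the exporter's logic, and `to_markdown` with its `use_code_for_kbd` flag is a hypothetical stand-in for the org-hugo-use-code-for-kbd setting.

```python
# Markers whose Markdown form is a symmetric wrapper.
SIMPLE_WRAPPERS = {"*": "**", "/": "_", "=": "`", "+": "~~"}


def to_markdown(token: str, use_code_for_kbd: bool = False) -> str:
    """Translate one Org-emphasized token such as *bold* to Markdown."""
    if len(token) < 3 or token[0] != token[-1]:
        return token  # not a marked-up token; leave untouched
    marker, body = token[0], token[1:-1]
    if marker in SIMPLE_WRAPPERS:
        wrapper = SIMPLE_WRAPPERS[marker]
        return f"{wrapper}{body}{wrapper}"
    if marker == "~":
        # Code markup becomes <kbd> only when the option is enabled.
        return f"<kbd>{body}</kbd>" if use_code_for_kbd else f"`{body}`"
    if marker == "_":
        return f'<span class = "underline">{body}</span>'
    return token


print(to_markdown("*bold*"))
print(to_markdown("~C-h f~", use_code_for_kbd=True))
```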

Keyboard tag

Use Org Code markup for kbd tag (default behavior)

This is the default behavior. So ~C-h f~ will show up as `C-h f` and then <code>C-h f</code> in the final Hugo generated HTML.

Example:

  • Few of Emacs help keybindings: C-h f, C-h v

Use Org Code markup for kbd tag

Here the Org code markup is explicitly specified to be used for <kbd> tag generation by setting EXPORT_HUGO_USE_CODE_FOR_KBD property to t. So ~C-h f~ will show up as <kbd>C-h f</kbd>.

Example:

  • Few of Emacs help keybindings: C-h f, C-h v

Don’t Use Org Code markup for kbd tag

Note that to disable the “use code for kbd” option, the value portion of the property needs to be left empty instead of setting to nil!
:PROPERTIES:
:EXPORT_HUGO_USE_CODE_FOR_KBD:
:END:

Here ~C-h f~ will show up as `C-h f` in Markdown and then <code>C-h f</code> in the final Hugo generated HTML.

Example:

  • Few of Emacs help keybindings: C-h f, C-h v

Multi-line bold

This works fine as the bold sentence does not include a newline.

*This is a sentence that should render completely in bold. It is broken across multiple lines (in Org source) because of auto-filling. But that should not break the bold rendering. But it does by default.*

If you do not see the above paragraph completely in bold, have below in your emacs config to fix it:

(with-eval-after-load 'org
  ;; Allow multiple line Org emphasis markup.
  ;; http://emacs.stackexchange.com/a/13828/115
  (setcar (nthcdr 4 org-emphasis-regexp-components) 20) ;Up to 20 lines, default is just 1
  ;; Below is needed to apply the modified `org-emphasis-regexp-components'
  ;; settings from above.
  (org-set-emph-re 'org-emphasis-regexp-components org-emphasis-regexp-components))

Example block

Simple

This is an example

Example blocks with line number annotation

Default new line number start

line 1
 line 2

Specify new line number start

line 20
line 21

Default continued line numbers

 line 22
line 23

Specify continued line numbers jump

line 33
line 34
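
The numbering in the four example blocks above follows from how the -n and +n switches pick a starting number. The Python sketch below reproduces it, assuming the second block uses -n 20 and the fourth uses +n 10 (the switch values are inferred from the output and not shown here); `first_line_number` is a hypothetical helper.

```python
from typing import Optional


def first_line_number(switch: str, arg: Optional[int], previous_last: int) -> int:
    """Return the first line number of a block for an Org -n/+n switch."""
    if switch == "-n":
        # New numbering: start at 1 unless a start value is given.
        return 1 if arg is None else arg
    if switch == "+n":
        # Continued numbering: add 1 (or the given jump) to the previous last line.
        return previous_last + (1 if arg is None else arg)
    raise ValueError(f"unknown switch: {switch}")


blocks = [("-n", None), ("-n", 20), ("+n", None), ("+n", 10)]
last = 0
starts = []
for switch, arg in blocks:
    start = first_line_number(switch, arg, last)
    starts.append(start)
    last = start + 1  # each example block above has two lines
print(starts)  # [1, 20, 22, 33]
```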

Menu in front matter

Menu Meta Data in TOML Front Matter

Overriding few menu properties

For this post, we should see just the menu weight and identifier properties get overridden.

You need to set unique menu identifiers, else you get a Hugo error like this:

ERROR 2017/07/18 12:32:14 Two or more menu items have the same name/identifier in Menu "main": "menu-meta-data-in-yaml-front-matter".
Rename or set an unique identifier.

Overriding menu properties completely

For this post, we see that no menu properties are inherited from the parent; only the menu properties set in this subtree are effective.

Auto assign weights

Post with menu 1

Post with menu 2

Post with menu 3

Post with menu 4

Post with menu 5

Menu Meta Data in YAML Front Matter

White space in menu entry

Testing the addition of menu meta data to the YAML front matter. Here the front matter format is set to YAML using the HUGO_FRONT_MATTER_FORMAT key in property drawer.

Here there is white space in menu entry keyword.

White space in menu name

Testing the addition of menu meta data to the YAML front matter. Here the front matter format is set to YAML using the HUGO_FRONT_MATTER_FORMAT key in property drawer.

Here there is white space in menu name property.

Custom front matter

Custom front matter in one line

Custom front matter in multiple lines

:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :baz zoo
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :alpha 1
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :beta "two words"
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :gamma 10

From (org) Property syntax:

It is also possible to add to the values of inherited properties. The following results in the ‘genres’ property having the value “Classic Baroque” under the ‘Goldberg Variations’ subtree.

* CD collection
** Classic
:PROPERTIES:
:GENRES: Classic
:END:
*** Goldberg Variations
:PROPERTIES:
:Title:     Goldberg Variations
:Composer:  J.S. Bach
:Artist:    Glen Gould
:Publisher: Deutsche Grammophon
:NDisks:    1
:GENRES+:   Baroque
:END:

Custom front matter with list values

Custom front matter with list values in TOML

:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :strings-symbols '("abc" def "two words")
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :integers '(123 -5 17 1_234)
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :floats '(12.3 -5.0 -17E-6)
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :booleans '(true false)

Issue # 99

Custom front matter with list values in YAML

:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :strings-symbols '("abc" def "two words")
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :integers '(123 -5 17 1_234)
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :floats '(12.3 -5.0 -17E-6)
:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :booleans '(true false)

Issue # 99

Outputs

Output HTML and JSON

Note: A single.json is required to be at a valid location in the template lookup hierarchy for the JSON outputs to be created.

Here’s the JSON output version of this page.

Setting empty outputs is fine

If the EXPORT_HUGO_OUTPUTS property is left empty/unset, ox-hugo will not set the outputs variable in the front-matter at all. So only the HTML output will be created (default).

Post body

Summary Splitter

Here is the summary.

Here is text after the summary splitter.

Dealing with underscores

This underscore should appear escaped in Markdown: _

This underscore is in a verbatim block, so it should not be escaped: _

This underscore also shouldn’t be escaped as it’s in an emoji code: 🙌

And these ones should be eventually removed and underline the text (Requires CSS to do so.) – Org syntax.

Nested bold and italics

  • This is italics, and *this is bold too*, and back to plain italics.
  • This is bold, and /this is italics too/, and back to plain bold.

Single and Double quotes

The strings in these two columns should look the exact same.
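|   | Rendered Actual          | Rendered Expectation     |
|---+--------------------------+--------------------------|
| 1 | ‘This’                   | ‘This’                   |
| 2 | “This”                   | “This”                   |
| 3 | “It’s”                   | “It’s”                   |
| 4 | ‘It’s’                   | ‘It’s’                   |
| 5 | http://localhost:1111/   | http://localhost:1111/   |
| 6 | http://localhost:1111/”. | http://localhost:1111/”. |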

Note: There is a rendering issue in Row 5 above. That seems to be a corner case, because Row 6 looks fine just because there was a trailing period. Will live with this issue for now.

ndash `and` mdash

The strings in these two columns should look the exact same.
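|   | Character | Rendered Actual | Rendered Expectation |
|---+-----------+-----------------+----------------------|
| 1 | Hyphen    | a - b           | a - b                |
| 2 | Ndash     | a – b           | a – b                |
| 3 | Mdash     | a — b           | a — b                |
| 4 | Ellipsis  | a … b           | a … b                |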

Title sanitization

This post has italics, monospace and bold in the title. This is to test that those markup characters do not end up in the title front matter of the post because HTML does not allow markup in the <title> section.

So the title of this post should read as “ndash and mdash”.

Footnotes Test

Footnotes 1

This is some text[fn:1].

Note to self: You *cannot* name an Org heading ‘Footnotes’; that’s reserved by Org to store all the footnotes.

Footnotes 2

This is some text[fn:2].

Footnotes in a row

This is some text[fn:1][fn:2].

Multiple references of same footnote

This is some text[fn:1]. This is some text[fn:1]. This is some text[fn:1].

Multi-line footnote

This is some text[fn:3].

Bind footnotes to the preceding word

{{{oxhugoissue(96)}}}

To test the fix for this, increase/decrease the width of the browser window showing this page so that the test lines below start wrapping around, and you will see that the footnote references will never be on their own on a new line.

Footnote ref at EOL

Last word, followed by FOOTNOTE PERIOD — Good Case A

  • As there is no space in-between “word FOOTNOTE PERIOD”, this text will stay unmodified.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1].

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1].

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1].

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1].

Last word, followed by FOOTNOTE space PERIOD — Bad Case A1

  • In this case, the space before the PERIOD at EOL is removed.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] .

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] .

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] .

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] .

Last word, followed by PERIOD space FOOTNOTE — Bad Case A2

  • In this case, the space before FOOTNOTE is replaced with &nbsp;.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1]

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1]

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1]

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1]

Last word, followed by space FOOTNOTE space PERIOD — Bad Case A3

  • This is a blend of Bad Case A1 and Bad Case A2 above.
  • In this case, the space before FOOTNOTE is replaced with &nbsp;, AND the space before the PERIOD at EOL is removed.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] .

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] .

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] .

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] .

abcde a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] .

Footnote NOT at EOL

Word, followed by FOOTNOTE PERIOD — Good Case B

  • As there is no space in-between “word FOOTNOTE PERIOD”, this text will stay unmodified.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1]. B b b.

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1]. B b b.

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1]. B b b.

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1].

Word, followed by FOOTNOTE space PERIOD — Bad Case B1

  • In this case, the space before the PERIOD at EOL is removed.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] . B b b.

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] . B b b.

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] . B b b.

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a[fn:1] . B b b.

Word, followed by PERIOD space FOOTNOTE — Bad Case B2

  • In this case, the space before FOOTNOTE is replaced with &nbsp;.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1] B b b.

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1] B b b.

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1] B b b.

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a. [fn:1] B b b.

Word, followed by space FOOTNOTE space PERIOD — Bad Case B3

  • This is a blend of Bad Case B1 and Bad Case B2 above.
  • In this case, the space before FOOTNOTE is replaced with &nbsp;, AND the space before the PERIOD at EOL is removed.

a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] . B b b.

ab a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] . B b b.

abc a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] . B b b.

abcd a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] . B b b.

abcde a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a [fn:1] . B b b.
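
The rewrites described for the "Bad" cases above amount to two substitutions: drop the space between a footnote reference and a following period, and turn the space before a footnote reference into `&nbsp;`. The Python sketch below applies them to Org-style `[fn:N]` references purely for illustration. The real exporter may operate on a different representation, and the names `FOOTNOTE` and `bind_footnotes` are hypothetical.

```python
import re

# Matches an Org footnote reference such as [fn:1].
FOOTNOTE = r"\[fn:[^\]]+\]"


def bind_footnotes(text: str) -> str:
    """Keep footnote references attached to the preceding word."""
    # Cases A1/B1: remove the space between FOOTNOTE and PERIOD.
    text = re.sub(rf"({FOOTNOTE}) +\.", r"\1.", text)
    # Cases A2/B2: replace the space before FOOTNOTE with a non-breaking space.
    text = re.sub(rf" +({FOOTNOTE})", r"&nbsp;\1", text)
    return text


print(bind_footnotes("a a[fn:1]."))    # good case, unchanged
print(bind_footnotes("a a [fn:1] ."))  # case A3, both rewrites apply
```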

Tags

Basic tags

Testing tags set using Org tags in headings.

Prefer hyphens and allow spaces

Hyphens and spaces in tags

Hyphens and spaces in categories

Org tags do not allow spaces. So the trick we use is to replace double underscores with spaces.

So an Org tag @abc__def becomes Hugo category abc def.

Hyphens in Org tags

Prefer

Prefer Hyphen in Tags

Prefer Hyphen Categories

Don’t Prefer

Don’t Prefer Hyphen in Tags

Don’t Prefer Hyphen Categories

Spaces in Org Tags

Want Spaces

Spaces in tags

Org tags do not allow spaces. So the trick we use is to replace double underscores with spaces.

So an Org tag abc__def becomes Hugo tag abc def.

Spaces in categories

Org tags do not allow spaces. So the trick we use is to replace double underscores with spaces.

So an Org tag @abc__def becomes Hugo category abc def.
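
The tag and category rules from this section fit in a few lines. The Python sketch below is an illustration for the "Want Spaces" setting only, not the exporter's code; `classify_org_tag` is a hypothetical name.

```python
def classify_org_tag(org_tag: str):
    """Return ("category" | "tag", name) for an Org tag when spaces are wanted."""
    if org_tag.startswith("@"):
        kind, name = "category", org_tag[1:]  # a leading @ marks a category
    else:
        kind, name = "tag", org_tag
    # Org tags cannot contain spaces, so a double underscore stands in for one.
    return kind, name.replace("__", " ")


print(classify_org_tag("abc__def"))   # ('tag', 'abc def')
print(classify_org_tag("@abc__def"))  # ('category', 'abc def')
```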

Don’t Want Spaces

No Spaces in tags

No Spaces in categories

Tags as Categories

Category A

Cat A post 1

This post is in category catA and tagged meow.

Cat A and cat B

This gets both categories catA and catB.

Do not leak post’s immediate sub-heading tag into the front-matter

Sub-heading 1

This is a special case where:

  • A post has a sub-heading as the first line in its body, and
  • That sub-heading has a tag too!

The passing case for this test would be that the unexpected_tag does not leak into the post’s front-matter.

Links

Links with target attribute

This link (to Hyperlinks chapter in Org manual) will open in a new tab as it is annotated with target="_blank".

Here’s the same link but with target="_self" annotation. So clicking it will open that link in this same tab!

http://orgmode.org/img/org-mode-unicorn-logo.png

Above is a link to an image. The width attribute of 10% though must apply only to the image, and not to the link, and the target attribute must apply only to the link, and not to the image.

Here’s the same link again, but this time there is no #+ATTR_HTML annotation. So the behavior will depend on the browser (typically an external link will open in a new tab automatically).

Within the same post (Internal links)

Link to headings by name

Alpha 101

  • Link (with description) to a heading with section number: Link to Beta 102 heading
  • Link (no description) to a heading without section number: * Zeta 103.

    The space after that * in the link is optional.. so this also works: *Zeta 103.

Beta 102

Gamma 102.1
Delta 102.1.1
Epsilon 102.1.2

Zeta 103

Links (no descriptions) to headings with section numbers

Link to a heading CUSTOM_ID

Obviously, all the CUSTOM_IDs set by the user in this file have to be unique.

Heading 1

Heading 2

Links to Org targets

From (org) Internal links,
- one item
- <<target>>another item
Here we refer to item [[target]].

will output below (lorem-ipsum added to increase page content so that the link jump is evident):

  • one item
  • <<target>>another item

Scroll to the end of the below ‘lorem-ipsum’ block to find the test link.

Here we refer to item target.

Links to source blocks

From (org) Internal links,

If no dedicated target exists, the link will then try to match the exact name of an element within the buffer. Naming is done with the ‘#+NAME’ keyword, which has to be put in the line before the element it refers to, as in the following example

#+NAME: My Target
| a  | table      |
|----+------------|
| of | four cells |
  

Also, when targeting a #+NAME keyword, the #+CAPTION keyword is mandatory in order to get proper numbering for source blocks, images and tables.

So the below code block:

#+CAPTION: Hello
#+NAME: code__hello
#+BEGIN_SRC emacs-lisp
(message "Hello")
#+END_SRC

*Here we refer to code snippet [[code__helloagain]].*

#+INCLUDE: "./all-posts.org::#lorem-ipsum" :only-contents t

#+CAPTION: Hello Again
#+NAME: code__helloagain
#+BEGIN_SRC emacs-lisp
(message "Hello again")
#+END_SRC

*Here we refer to code snippet [[code__hello]].*

will output below (lorem-ipsum added to increase page content so that the link jump is evident):

(message "Hello")

Here we refer to code snippet code__helloagain.

(message "Hello again")

Here we refer to code snippet code__hello.

Links to tables

Note: When targeting a #+NAME keyword, =#+CAPTION= keyword is mandatory in order to get proper numbering for source blocks, images and tables.
#+CAPTION: Simple table 1
#+NAME: tab__simple1
| a | b |
| c | d |

*Here we refer to table [[tab__simple2]].*

#+INCLUDE: "./all-posts.org::#lorem-ipsum" :only-contents t

Here's another table:

#+CAPTION: Simple table 2
#+NAME: tab__simple2
| e | f |
| g | h |

*Here we refer to table [[tab__simple1]].*

will output below (lorem-ipsum added to increase page content so that the link jump is evident):

ab
cd

Here we refer to table tab__simple2.

Here’s another table:

ef
gh

Here we refer to table tab__simple1.


Reference: (org) Images and tables.

Links to images

Note: When targeting a #+NAME keyword, =#+CAPTION= keyword is mandatory in order to get proper numbering for source blocks, images and tables.
#+CAPTION: Org Logo
#+NAME: img__org_logo1
[[/images/org-mode-unicorn-logo.png]]

*Here we refer to image [[img__org_logo2]].*

#+INCLUDE: "./all-posts.org::#lorem-ipsum" :only-contents t

Here's the same image again, but with a different Org link ID:

#+CAPTION: Same Org Logo
#+NAME: img__org_logo2
[[/images/org-mode-unicorn-logo.png]]

*Here we refer to image [[img__org_logo1]].*

will output below (lorem-ipsum added to increase page content so that the link jump is evident):

/images/org-mode-unicorn-logo.png

Here we refer to image img__org_logo2.

Here’s the same image again, but with a different Org link ID:

/images/org-mode-unicorn-logo.png

Here we refer to image img__org_logo1.


Reference: (org) Images and tables.

Equations

Inline equations

  • Inline equations are wrapped between \( and \).
    • $ wrapping also works, but it is not preferred as it comes with restrictions like “there should be no whitespace between the equation and the $ delimiters”.

      So $ a=b $ will not work (it will look like: $ a=b $), but $a=b$ will work (it will look like: $a=b$).

      On the other hand, both \(a=b\) (it will look like: \(a=b\)) and \( a=b \) (it will look like: \( a=b \)) will work.

  • One-per-line equations are wrapped between \[ and \] or $$ delimiters.

For example, below in Org:

LaTeX formatted equation: \( E = -J \sum_{i=1}^N s_i s_{i+1} \)

will look like this in Hugo rendered HTML:

LaTeX formatted equation: \( E = -J \sum_{i=1}^N s_i s_{i+1} \)

(Don’t judge this from the Markdown; see how it looks after Hugo has processed it.)

Here’s another example, taken from (org) LaTeX fragments.

Below in Org:

If $a^2=b$ and \( b=2 \), then the solution must be either
$$ a=+\sqrt{2} $$ or \[ a=-\sqrt{2} \]

renders to:

If $a^2=b$ and \( b=2 \), then the solution must be either $$ a=+\sqrt{2} $$ or \[ a=-\sqrt{2} \]

(Note that the last two equations show up on their own lines.)

Equations with (r), (c), ..

{{{oxhugoissue(104)}}}

Below, (r) or (R) should not get converted to &reg;, (c) or (C) should not get converted to &copy;, and (tm) or (TM) should not get converted to &trade;:

  • $(r)$ $(R)$
  • $(c)$ $(C)$
  • $(tm)$ $(TM)$
  • \( (r) \) \( (R) \)
  • \( (c) \) \( (C) \)
  • \( (tm) \) \( (TM) \)

Same as above but in Block Math equations:

$$ (r) (R) $$ $$ (c) (C) $$ $$ (tm) (TM) $$

\[ (r) (R) \] \[ (c) (C) \] \[ (tm) (TM) \]

Lists

List following a list

You need to force end of list when you have something like an unordered list immediately following an ordered list.

The easiest and cleanest way to do that is adding a comment between those lists. Reference

That is how it is implemented in the Org exporter backend. In Org itself, two consecutive blank lines after a list end the list.

In the below example, the foo* items would be in a different <ul> element than the bar* items.

Unordered list following an unordered list

  • foo1
  • foo2
  • bar1
  • bar2

Unordered list following an ordered list

  1. foo3
  2. foo4
  • bar3
  • bar4

Ordered list following an unordered list

  • foo5
  • foo6
  1. bar5
  2. bar6

Description list following an ordered list

  • foo1
  • foo2
bar1
description
bar2
description

Nested lists

  • foo1
  • foo2
    • bar1
    • bar2
      • baz1
      • baz2
        • zoo1
        • zoo2
          1. numbered1
          2. numbered2

Force ordered list numbering

  1. This will be 1.
  2. This will be 2.
  1. [@10] This will be 10!
  2. This will be 11.
  1. [@17] This will be 17!
  2. This will be 18.
  3. [@123] This will be 123!
  4. This will be 124.
  1. This will be 1 again.
  2. This will be 2.

Another example:

  1. This will be 1.
  2. [@3] This will be 3!
  3. [@7] This will be 7!
  4. [@100] This will be 100!

See (org) Plain lists to read more about plain lists in Org.

Checklist

This is a check-list:

Checklist 1 [60%]

Checklist showing progress in percentage.

  • [ ] Task 1
  • [X] Task 2
  • [X] Task 3
  • [ ] Task 4
  • [X] Task 5

Checklist 2 [2/5]

Checklist showing progress in ratio.

  • [ ] Task 1
  • [ ] Task 2
  • [X] Task 3
  • [ ] Task 4
  • [X] Task 5

Quotes

Consecutive quotes

Some text.

Quote 1. This is a long quote that auto-fills into multiple lines in Org, but it will be a single paragraph in the exported format.

Quote 2. This is a short quote.

Quote 3. This is a multi-paragraph quote.

This is the second paragraph.

Some other text.

Example block inside quote block

Some text.

Some quoted text.

(some-example)
  

Some other text.

Multiple example blocks inside quote block

Some text.

Some quoted text.

(some-example)
  
(some-other-example)
  

Some other text.

Source block inside quote block, followed by another source block outside

Blackfriday Issue # 407

Some text.

Some quoted text.

(message "hello")
  
(message "hello again")

Some other text.

Example blocks inside quote block, followed by another example block outside

Blackfriday Issue # 407

Some text.

Some quoted text.

(some-example)
  
(some-other-example)
  
(yet-another-example)

Some other text.

Source block, followed by a quote block containing another source block

Some text.
(message "hello")

Some quoted text.

(message "hello again")
  

Some other text.

Example block with escaped Org syntax inside quote block

Some text.

Some quoted text.

#+NAME: some_example
(some-example)
  

Some other text.

Verse

One verse

To preserve the line breaks, indentation and blank lines in a region, but otherwise use normal formatting, you can use the verse construct, which can also be used to format poetry – Reference.

Consecutive verses

Verse for indentation

Some text before indented text.

Org removes indentation from the first line of the text block even in a Verse block. To get around that, the trick is to use the > character before the required indentation spaces only on the first non-blank line in a Verse block. Only that first > character is removed when translating to Markdown.

More examples

  • More indentation than in the above example:
  • Leading blank line followed by indented text:
  • Indented text followed by a trailing blank line:
  • Using tab characters for indentation; each tab character still counts as one &nbsp; in HTML.

Corner cases

Only the first > character immediately following spaces and empty lines will be removed:

If someone really wants to have > as the first non-blank character in the final output, they can use >> instead, but only for that first instance. The below Verse block is the same as above except that the first > is retained in the final output.

Org TODO keywords

Post with a TODO heading

Heading 1

Some text.

Heading 2

Some text.

Post with a DONE heading

Heading 1

Some text.

Heading 2

Some text.

Blackfriday Options

Fractions

Fraction Table

/1/2/3/4/5/6/7/8/9/10/11/12/13
1/11/31/51/61/71/81/91/101/111/121/13
2/12/22/32/42/52/62/72/82/92/102/112/122/13
3/13/23/33/53/63/73/83/93/103/113/123/13
4/14/24/34/44/54/64/74/84/94/104/114/124/13
5/15/25/35/45/55/65/75/85/95/105/115/125/13
6/16/26/36/46/56/66/76/86/96/106/116/126/13
7/17/27/37/47/57/67/77/87/97/107/117/127/13
8/18/28/38/48/58/68/78/88/98/108/118/128/13
9/19/29/39/49/59/69/79/89/99/109/119/129/13
10/110/210/310/410/510/610/710/810/910/1010/1110/1210/13
11/111/211/311/411/511/611/711/811/911/1011/1111/1211/13
12/112/212/312/412/512/612/712/812/912/1012/1112/1212/13
13/113/213/313/413/513/613/713/813/913/1013/1113/1213/13

Blackfriday fractions false

A Blackfriday option can be disabled by setting the option value to nothing, nil or false.

These will not be rendered as fractions:

But these will always be rendered as fractions, even when the Blackfriday fractions option is set to false like in this post.

  • 1/2, 1/4, 3/4

Blackfriday fractions true

A Blackfriday option can be enabled by setting the option value to t or true.

All of these will be rendered as fractions:

Below are special as they will always be rendered as fractions, even when the Blackfriday fractions option is set to false (though this post has that option set to true – which is also the default value).

  • 1/2, 1/4, 3/4

Extensions

Hard line break wrong case (TOML)

The Blackfriday hardLineBreak extension is enabled here even though the user used the wrong case in the extension name:
:EXPORT_HUGO_BLACKFRIDAY: :extensions hardlinebreak

instead of:

:EXPORT_HUGO_BLACKFRIDAY: :extensions hardLineBreak

The Blackfriday extension names are case-sensitive. So even though the wrong case is used in the Org property drawer, ox-hugo ensures that the Markdown front matter is written in the correct case!

a b c

Above, a, b and c must appear on separate lines.

Hard line break (TOML)

a b c

Above, a, b and c must appear on separate lines.

Hard line break (YAML)

a b c

Above, a, b and c must appear on separate lines.

Enabling/Disabling extensions

:EXPORT_HUGO_BLACKFRIDAY+: :angledquotes t :hrefTargetBlank true
:EXPORT_HUGO_BLACKFRIDAY+: :extensions tabsizeeight hardlinebreak
:EXPORT_HUGO_BLACKFRIDAY+: :extensionsmask fencedcode strikethrough

Enabling/Disabling extensions example

Extensions enabled
tabSizeEight, hardLineBreak
Extensions disabled
fencedCode, strikethrough
Angled quotes enabled

“this”

Hard line break enabled

a b c

Plain ID Anchors disabled

Check the ID for all the headings in this post’s HTML. The ID’s will look something like:

<h2 id="plain-id-anchors-disabled:c94b2acd735ed6a466ef85be48bdea8c">Plain ID Anchors disabled</h2>

where :c94b2acd735ed6a466ef85be48bdea8c is the document ID.

Fractions disabled

2/5

Smart dashes disabled

a–b c–d

Fenced code disabled

Below, the code block language name will show up before the code.

(message "Hello")
Strikethrough disabled

not-canceled

Enabling/Disabling extensions (TOML)

Enabling/Disabling extensions (YAML)

Post Weight (Not the menu item weight)

Auto post-weight calculation

Post with auto weight calc 1 (EXPORT_HUGO_WEIGHT as subtree property)

Post with auto weight calc 2 (EXPORT_HUGO_WEIGHT as subtree property)

Post with auto weight calc 3 (EXPORT_HUGO_WEIGHT as subtree property)

Post with auto weight calc 4 (EXPORT_HUGO_WEIGHT as subtree property)

Post with auto weight calc 5 (EXPORT_HUGO_WEIGHT as subtree property)

Manually specified post weights

Post with weight 123

Post with weight 4567

Parsing date from CLOSED property

The “CLOSED” state of this heading (which is nil) should be ignored

When an Org TODO item is switched to the DONE state, a CLOSED property is auto-inserted (default behavior).

If such a property is non-nil, the value (time-stamp) of that is used to set the date field in the exported front-matter.

Reference
(org) Special properties or C-h i g (org) Special properties

Date Formats

Just date

Date + Time

Date + Time (UTC)

Date + Time (behind UTC)

Date + Time (after UTC)

Invalid Date

It’s possible that someone is using an existing Org file to export to Hugo. Some exporters like ox-texinfo recognize dates of style YEAR1-YEAR2 to use them in Copyright headers.

But that date is invalid as per the standard date format used by Hugo in date front-matter, and also as per org-parse-time-string.

So in that case, don’t allow org-parse-time-string to throw an error and abort the export, but instead simply don’t set the date in the front-matter.

In this post the :EXPORT_DATE: property is set to 2012-2017, but the export will still happen fine, with the date front-matter not set.

Preserve filling option

Filling is preserved

abc def ghi

Filling is not preserved

abc def ghi

Section Inheritance

Section A

Post A1

This post should be created in content/section-a/.

Category X

Post AX

This post should also be created in content/section-a/.

Keywords

TOC

Post with TOC using keyword set to 0

Post with TOC using keyword set to 2

Post with TOC using keyword set to 6

Export Options

Table of Contents (TOC)

ox-hugo has the with-toc option disabled by default as Hugo has an inbuilt TOC generation feature.

Still some people might prefer to use the Org generated TOC.

Section Numbers

Don’t number headlines or TOC

Don’t number headlines (but yes in TOC)

Number 0 levels

Number 2 levels

Number all levels

TOC

num set to nil

TOC with all headings (unnumbered)
TOC with headings (unnumbered) only till level 2
No TOC as toc set to nil

num set to t

TOC with all headings (numbered, except for selected unnumbered)
TOC with headings (numbered, except for selected unnumbered) only till level 2
No TOC as toc set to 0

num set to onlytoc

TOC with all headings (post-unnumbered, TOC-numbered)
TOC with headings (post-unnumbered, TOC-numbered) only till level 2

No TOC in Summary

By default, Hugo will dump everything at the beginning of a post into its .Summary (See Hugo content summaries). As TOC enabled using the export option like toc:t is inserted at the beginning of a post, TOC will be shown in that summary too!
ox-hugo’s Solution

ox-hugo helps prevent that with a workaround: it inserts a special HTML comment =<!--endtoc-->= after the TOC.

It is important to insert a user-defined summary split by using #+HUGO: more. Otherwise it is very likely that the TOC is big enough to exceed the Hugo-defined max-summary length and so the <!--endtoc--> that appears after the TOC never gets parsed.

Always use =#+HUGO: more= when you enable Org generated TOC’s.

In your site’s Hugo template, you can then filter that out with something like:

Snippet
{{ $summary_splits := split .Summary "<!--endtoc-->" }}
{{ if eq (len $summary_splits) 2 }}
    <!-- If that endtoc special comment is present, output only the part after that comment as Summary. -->
    {{ index $summary_splits 1 | safeHTML }}
{{ else }}
    <!-- Print the whole Summary if endtoc special comment is not found. -->
    {{ .Summary }}
{{ end }}
Example

See this test site’s =summary.html= as an example.

Sub/superscripts

Sub/superscripts require braces

By default, ox-hugo implements the ^:{} export option. See C-h v org-export-with-sub-superscripts for details. With this option, the text that needs to be subscripted or superscripted has to be surrounded by braces {..} following the _ or ^.

Following text will export _ and ^ verbatim

a_b a_bc a^b a^bc

Following text will export _{..} as subscript and ^{..} as superscript

ab abc ab abc

Sub/superscripts don’t require braces

Following text will export _.. as subscript and ^.. as superscript

a_b a_bc a^b a^bc

Following text will export _{..} as subscript and ^{..} as superscript

ab abc ab abc

Disable exporting title

This post will be exported without title in the front-matter because it is disabled using :EXPORT_OPTIONS: title:nil.

Disable exporting author

This post will be exported without author in the front-matter because it is disabled using :EXPORT_OPTIONS: author:nil.

Creator

Default Creator

The front-matter for this post contains the default Creator string.

Custom Creator

The front-matter for this post contains a custom Creator string.

Export snippets and blocks

Export snippet

Export snippet Hugo

@@hugo:This will get exported **only for** Hugo exports, `verbatim`.@@

@@hugo:Newlines in Org source between the `@​@` pairs are allowed too (but no blank lines).@@

Use Export Blocks if you need blank lines in-between.

Export snippet Markdown

@@md:This Markdown **Export Snippet** will also get exported for Hugo exports, `verbatim`.@@

@@markdown:_This one too_@@

NOTE
ox-md.el does not support Export Snippets as of writing this <2017-12-08 Fri>. So even the @@md:foo@@ and @@markdown:foo@@ snippets are handled by ox-hugo directly.

Export snippet HTML

@@html:This HTML <b>Export Snippet</b> will also get exported for Hugo exports, <code>verbatim</code>.@@

Export block

Export block Hugo

Export block Markdown

Export block HTML

Miscellaneous Front Matter

Hugo Aliases

Alias without section portion 1

As the specified alias does not contain the “/” string, it will be auto-prefixed with the section for the current post.

New section just for test

Alias without section portion 2

As the specified alias does not contain the “/” string, it will be auto-prefixed with the section for the current post.

Alias specifying a different section

Alias specifying root section

Multiple aliases without section portion

Multiple aliases with section portion

Author

Single author

This post has 1 author.

Multiple authors

This post has multiple authors. In Org, multiple authors are added comma-separated.

Real Examples

Multifractals in ecology using R

:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :source https://github.com/lsaravia/MultifractalsInR/blob/master/Curso3.md
Disclaimer
This post is from the link posted by GitHub user *lsaravia* in this comment. All credit for this post goes to the original author.

/images/MultifractalsInR/fractal-ice.jpg

Multifractals

  • Many natural systems cannot be characterized by a single number such as the fractal dimension. Instead an infinite spectrum of dimensions must be introduced. /images/MultifractalsInR/C3_Clouds.png

Multifractal definition

  • Consider a given object $\Omega$. Its multifractal nature is determined in practice by covering the system with a set of boxes $\{B_i(r)\}$ with $(i=1,\ldots, N(r))$ of side length $r$.
  • These boxes are nonoverlapping and such that

    $$\Omega = \bigcup_{i=1}^{N(r)} B_i(r)$$

    This is the box-counting method, but now a measure $\mu(B_n)$ is computed for each box. This measure corresponds to the total population or biomass contained in $B_n$, and in general it scales as:

    $$\mu(B_n) \propto r^\alpha$$

Box counting

/images/MultifractalsInR/C3_BoxCounting.png

The generalized dimensions

  • The fractal dimension $D$ already defined is actually one of an infinite spectrum of so-called correlation dimensions of order $q$, also called Renyi entropies.

    $$D_q = \lim_{r \to 0} \frac{1}{q-1}\frac{\log \left[ \sum_{i=1}^{N(r)} p_i^q \right]}{\log r}$$

    where $p_i=\mu(B_i)$ and a normalization is assumed:

    $$\sum_{i=1}^{N(r)} p_i=1$$

  • For $q=0$ we have the familiar definition of fractal dimension. To see this, substitute $q=0$:

    $$D_0 = -\lim_{r \to 0}\frac{\log N(r)}{\log r}$$

Generalized dimensions 1

  • It can be shown that the inequality $D_{q'} \leq D_q$ holds for $q' \geq q$.
  • The sum

    $$M_q(r) = \sum_{i=1}^{N(r)}[\mu(B_i(r))]^q = \sum_{i=1}^{N(r)}p_i^q$$

    is the so-called moment or partition function of order $q$.

  • Varying $q$ makes it possible to measure the non-homogeneity of the pattern. The moments with larger $q$ are dominated by the densest boxes. For $q<0$ the dominant contribution comes from boxes with small $p_i$.
  • Alternatively, we can think that for $q>0$, $D_q$ reflects the scaling of the large fluctuations and strong singularities. In contrast, for $q<0$, $D_q$ reflects the scaling of the small fluctuations and weak singularities.
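The partition function and the generalized dimensions above can be tried out numerically. The sketch below is not part of the original post: it assumes a binomial multiplicative cascade (a standard textbook multifractal) as test data, and the function names `partition_function`, `generalized_dimension` and `binomial_cascade` are made up for illustration. It evaluates $D_q$ at a single finite box size instead of taking the limit $r \to 0$.

```python
import numpy as np

def partition_function(p, q):
    """M_q = sum_i p_i^q, taken over the non-empty boxes only."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return np.sum(p ** q)

def generalized_dimension(p, r, q):
    """Finite-r estimate of D_q = 1/(q-1) * log(M_q) / log(r), for q != 1."""
    return np.log(partition_function(p, q)) / ((q - 1) * np.log(r))

def binomial_cascade(levels, m=0.7):
    """Hypothetical test measure: split every box in two, giving a fraction m
    of its mass to one half and 1 - m to the other, `levels` times."""
    p = np.array([1.0])
    for _ in range(levels):
        p = np.concatenate([p * m, p * (1 - m)])
    return p

levels = 10
p = binomial_cascade(levels)      # 2**levels box probabilities, summing to 1
r = 2.0 ** -levels                # side length of each box on the unit interval

d0 = generalized_dimension(p, r, 0)   # box-counting dimension
d2 = generalized_dimension(p, r, 2)   # correlation dimension
print(d0, d2)
```

For this measure the dimension decreases as $q$ grows, in line with the inequality $D_{q'} \leq D_q$ for $q' \geq q$.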

Exercise

  • Calculate the partition function for the center and lower images of the figure: /images/MultifractalsInR/C3_BoxCounting.png

Two important dimensions

  • Two particular cases are $q=1$ and $q=2$. The dimension for $q=1$ is the Shannon entropy, which ecologists also call Shannon’s index of diversity.

    $$D_1 = -\lim_{r \to 0}\sum_{i=1}^{N(r)} p_i \log p_i$$

    and the second is the so-called correlation dimension:

    $$D_2 = \lim_{r \to 0} \frac{\log \left[ \sum_{i=1}^{N(r)} p_i^2 \right]}{\log r} $$

    The numerator is the log of the Simpson index.

Application

  • Salinity stress in the cladoceran Daphniopsis australis. Behavioral experiments were conducted on individual males, and their successive displacements were analyzed using the generalized dimension function $D_q$ and the mass exponent function $\tau_q$. /images/MultifractalsInR/C3_Cladoceran.png Both functions indicate that the successive displacements of male D. australis have weaker multifractal properties. This is consistent with and generalizes previous results showing a decrease in the complexity of behavioral sequences under stressful conditions for a range of organisms.
  • A shift between multifractal and fractal properties or a change in multifractal properties, in animal behavior is then suggested as a potential diagnostic tool to assess animal stress levels and health.

Mass exponent and Hurst exponent

  • The same information contained in the generalized dimensions can be expressed using mass exponents:

    $$M_q(r) \propto r^{-\tau_q}$$

    This is the scaling of the partition function. For monofractals $\tau_q$ is linear and related to the Hurst exponent:

    $$\tau_q = q H - 1$$

    For multifractals we have

    $$\tau_q = (q -1) D_q$$

    Note that for $q=0$, $D_q = \tau_q$ and for $q=1$, $\tau_q=0$

Paper

  1. Kellner JR, Asner GP (2009) Convergent structural responses of tropical forests to diverse disturbance regimes. Ecology Letters 12: 887–897. doi:10.1111/j.1461-0248.2009.01345.x.

Neural Network Basics

:EXPORT_HUGO_CUSTOM_FRONT_MATTER+: :source https://github.com/Vonng/Math/blob/master/nndl/nn-intro.md
Disclaimer
This post is from the link posted by GitHub user *Vonng* in this comment. All credit for this post goes to the original author.

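Notes on the basics of neural networks.

Neural network representation

Neuron model

Neural networks are inspired by how the brain works and can be used to solve general learning problems. The basic building block of a neural network is the neuron. Each neuron has one axon and several dendrites. Every dendrite connected to the neuron is an input; when the summed excitation level of all input dendrites exceeds a certain threshold, the neuron is activated. An activated neuron fires a signal along its axon, which branches into tens of thousands of dendrites connected to other neurons, so that the output of this neuron becomes an input of the others. Mathematically, a neuron can be represented by the perceptron model.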

/images/Vonng/neuron.png

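The mathematical model of a neuron consists mainly of the following:

| Name                | Symbol   | Description                                                   |
|---------------------+----------+---------------------------------------------------------------|
| Input               | $x$      | Column vector                                                 |
| Weight              | $w$      | Row vector; its dimension equals the number of inputs         |
| Bias                | $b$      | Scalar; the negative of the threshold                         |
| Weighted input      | $z$      | $z=w \cdot x + b$; the input of the activation function       |
| Activation function | $\sigma$ | Takes the weighted input and returns the activation           |
| Activation          | $a$      | Scalar; $a = \sigma(\vec{w} \cdot \vec{x}+b)$                 |

Activation function expression:

$$ a = \sigma\left( \left[ \begin{matrix} w_1 & \cdots & w_n \end{matrix}\right] \cdot \left[ \begin{array}{c} x_1 \\ \vdots \\ \vdots \\ x_n \end{array}\right] + b \right) $$

The activation function is usually an S-shaped function, also known as sigmoid or logsig, because it has good properties: it is smooth and differentiable, its shape is close to the hard-limit transfer function used by the perceptron, and both its value and its derivative are easy to compute.

$$ \sigma(z) = \frac{1}{1+e^{-z}} $$

$$ \sigma'(z) = \sigma(z)(1-\sigma(z)) $$

There are other activation functions as well, for example the hard-limit transfer function (hardlim), the symmetric hard-limit function (hardlims), the linear function (purelin), the symmetric saturating linear function (satlins), the log-sigmoid function (logsig), the positive linear function (poslin), the hyperbolic tangent sigmoid function (tansig) and the competitive function (compet). They are sometimes used for learning speed or other reasons, and are not covered here.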
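As a quick check of the two sigmoid formulas, here is a minimal NumPy sketch. The function names `sigmoid` and `sigmoid_prime` are chosen for illustration and are not taken from the original post; the closed-form derivative is compared against a central finite difference.

```python
import numpy as np

def sigmoid(z):
    """sigma(z) = 1 / (1 + exp(-z)), applied element-wise."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    """sigma'(z) = sigma(z) * (1 - sigma(z)), applied element-wise."""
    s = sigmoid(z)
    return s * (1.0 - s)

z = np.array([-2.0, 0.0, 2.0])
eps = 1e-6
# Central finite difference as an independent estimate of the derivative.
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(sigmoid_prime(z))
print(numeric)
```

The derivative only needs the value $\sigma(z)$ that the forward pass has already computed, which is what makes the sigmoid cheap to use during training.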

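Single-layer neural network model

A set of neurons that can operate in parallel is called a layer of the neural network.

Now consider a single-layer neural network with $n$ inputs and $s$ neurons (outputs). The mathematical model of a single neuron extends as follows:

| Name           | Symbol | Description                                                                                                   |
|----------------+--------+---------------------------------------------------------------------------------------------------------------|
| Input          | $x$    | All neurons in the layer share the same input, so it is unchanged: still an $(n \times 1)$ column vector      |
| Weight         | $W$    | The $1 \times n$ row vector becomes an $s \times n$ matrix; each row holds the weights of one neuron          |
| Bias           | $b$    | The $1 \times 1$ scalar becomes an $s \times 1$ column vector                                                 |
| Weighted input | $z$    | The $1 \times 1$ scalar becomes an $s \times 1$ column vector                                                 |
| Activation     | $a$    | The $1 \times 1$ scalar becomes an $s \times 1$ column vector                                                 |

Vector form of the activation function:

$$ \left[ \begin{array}{c} a_1 \\ \vdots \\ a_s \end{array}\right] = \sigma\left( \left[ \begin{matrix} w_{1,1} & \cdots & w_{1,n} \\ \vdots & \ddots & \vdots \\ w_{s,1} & \cdots & w_{s,n} \end{matrix}\right] \cdot \left[ \begin{array}{c} x_1 \\ \vdots \\ \vdots \\ x_n \end{array}\right] + \left[ \begin{array}{c} b_1 \\ \vdots \\ b_s \end{array}\right] \right) $$

A single-layer network has limited capability, so the outputs and inputs of several single-layer networks are usually connected to form a multi-layer neural network.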

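Multi-layer neural network model

  • The layers of a multi-layer network are numbered starting from 1. The first layer is the input layer, layer $L$ is the output layer, and the other layers are called hidden layers.
  • Each layer has its own parameters $W,b,z,a,\cdots$; superscripts are used to tell them apart: $W^2,W^3,\cdots$
  • The input of the whole network is the activation of the input layer, $x=a^1$; the output of the whole network is the activation of the output layer, $y'=a^L$.
  • Because the input layer has no neurons, the only quantity it has is the activation $a^1$, which serves as the network input; there is no $W^1,b^1,z^1$ and so on.

Now consider a neural network with $L$ layers whose layers contain $d_1,d_2,\cdots,d_L$ neurons respectively. The mathematical model of the network extends as follows:

| Name           | Symbol | Description                                                                                                                          |
|----------------+--------+--------------------------------------------------------------------------------------------------------------------------------------|
| Input          | $x$    | The input is unchanged: a $(d_1 \times 1)$ column vector                                                                             |
| Weight         | $W$    | The $s \times n$ matrix extends to a list of $L-1$ matrices: $W^2_{d_2 \times d_1},\cdots,W^L_{d_L \times d_{L-1}}$                  |
| Bias           | $b$    | The $s \times 1$ column vector extends to a list of $L-1$ column vectors: $b^2_{d_2},\cdots,b^L_{d_L}$                               |
| Weighted input | $z$    | The $s \times 1$ column vector extends to a list of $L-1$ column vectors: $z^2_{d_2},\cdots,z^L_{d_L}$                               |
| Activation     | $a$    | The $s \times 1$ column vector extends to a list of $L$ column vectors: $a^1_{d_1},a^2_{d_2},\cdots,a^L_{d_L}$                       |

Matrix form of the activation function:

$$ \left[ \begin{array}{c} a^l_1 \\ \vdots \\ a^l_{d_l} \end{array}\right] = \sigma\left( \left[ \begin{matrix} w^l_{1,1} & \cdots & w^l_{1,d_{l-1}} \\ \vdots & \ddots & \vdots \\ w^l_{d_l,1} & \cdots & w^l_{d_l,d_{l-1}} \end{matrix}\right] \cdot \left[ \begin{array}{c} a^{l-1}_1 \\ \vdots \\ \vdots \\ a^{l-1}_{d_{l-1}} \end{array}\right] + \left[ \begin{array}{c} b^l_1 \\ \vdots \\ b^l_{d_l} \end{array}\right] \right) $$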

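Meaning of the weight matrix

The weights of a multi-layer neural network are represented by a series of weight matrices.

  • The weight matrix of layer $l$ is written $W^l$; it holds the connection weights from the previous layer ($l-1$) to this layer ($l$).
  • Row $j$ of $W^l$ is written $W^l_{j*}$; it holds the weights of the connections from all $d_{l-1}$ neurons of layer $l-1$ to neuron $j$ of layer $l$.
  • Column $k$ of $W^l$ is written $W^l_{*k}$; it holds the weights of the connections from neuron $k$ of layer $l-1$ to all $d_l$ neurons of layer $l$.
  • The entry in row $j$, column $k$ of $W^l$ is written $W^l_{jk}$; it is the weight of the connection from neuron $k$ of layer $l-1$ to neuron $j$ of layer $l$.
  • As shown in the figure, $w^3_{24}$ is the weight of the connection from neuron 4 of layer 2 to neuron 2 of layer 3: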

/images/Vonng/nn-weight.png

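Just remember that in the weight matrix $W$ the row index is the index of the neuron in this layer, and the column index is the index of the neuron in the previous layer.

Neural network inference

Feed forward is one pass of computation in which the network takes an input and produces an output. It is also called one inference.

The computation proceeds as follows:

\begin{align}
a^1 &= x \\
a^2 &= \sigma(W^2 a^1 + b^2) \\
a^3 &= \sigma(W^3 a^2 + b^3) \\
\cdots \\
a^L &= \sigma(W^L a^{L-1} + b^L) \\
y &= a^L
\end{align}

Inference is really just a series of matrix multiplications and vector operations, so a trained network can be implemented efficiently in many languages. The function of a neural network is realized through inference. Inference is easy to implement; the real difficulty is how to train the network.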
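The feed-forward recurrence can be sketched in a few lines of NumPy. This is only an illustration: the layer sizes, the random parameters and the names `sigmoid` and `feed_forward` are assumptions made for the example, not part of the original post. Note that Python lists are 0-indexed, so `weights[0]` plays the role of $W^2$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feed_forward(weights, biases, x):
    """Apply a^l = sigma(W^l a^(l-1) + b^l) layer by layer, starting from a^1 = x."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Hypothetical 3-layer network with d_1 = 3, d_2 = 4, d_3 = 2 neurons.
rng = np.random.default_rng(0)
sizes = [3, 4, 2]
weights = [rng.standard_normal((n_out, n_in))          # W^l has shape d_l x d_(l-1)
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal((n_out, 1)) for n_out in sizes[1:]]

x = rng.standard_normal((3, 1))                        # one input column vector
y = feed_forward(weights, biases, x)
print(y.shape)
```

Each weight matrix has one row per neuron of its own layer and one column per neuron of the previous layer, matching the row and column convention described earlier.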

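Neural network training

Training a neural network is the process of adjusting its weight and bias parameters so that the network works better.

Gradient descent is usually used to adjust the parameters. First a cost function is defined to measure the error of the network; gradient descent then computes suitable parameter corrections so as to minimize that error.

Cost function

A cost function measures how well the network works. It is a real-valued function defined on one or more samples, and should usually satisfy the following conditions:

  1. The error is non-negative; the better the network works, the smaller the error.
  2. The cost can be written as a function of the network output.
  3. The total cost equals the mean of the per-sample costs: $C=\frac{1}{n} \sum_x C_x$

The most common simple cost function is the quadratic cost function, also called the mean squared error (MSE):

$$ C(w,b) = \frac{1}{2n} \sum_x \|y(x)-a\|^2 $$

The leading coefficient $\frac{1}{2}$ is there to give a cleaner form after differentiation, and $n$ is the number of samples used. Here $y$ and $x$ are known sample data.

In theory any metric that reflects how well the network works could serve as a cost function. The reason for using MSE rather than something like "number of correctly classified images" is that only a smooth, differentiable cost function allows the parameters to be adjusted by gradient descent.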
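A minimal sketch of the quadratic cost, assuming that network outputs and labels are stored as matrices with one column per sample. The function name `quadratic_cost` and the tiny data set are made up for illustration.

```python
import numpy as np

def quadratic_cost(outputs, targets):
    """C = 1/(2n) * sum_x ||y(x) - a||^2, where each column is one sample."""
    n = outputs.shape[1]
    return np.sum((targets - outputs) ** 2) / (2.0 * n)

# Two samples with two output neurons each (hypothetical values).
a = np.array([[0.8, 0.1],
              [0.2, 0.9]])   # network outputs
y = np.array([[1.0, 0.0],
              [0.0, 1.0]])   # labels

cost = quadratic_cost(a, y)
print(cost)
```

The cost is zero only when every output matches its label exactly, and it changes smoothly with the outputs, which is the property gradient descent relies on.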

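Use of samples

Computing the cost function needs one or more training samples. When there are very many training samples, recomputing the error over the entire training set in every round of training is very expensive and unacceptably slow. Using only a small part of the whole makes the computation much faster. This relies on one assumption, though: the cost of a random sample approximates the cost of the whole.

Depending on how the samples are used, gradient descent comes in several variants:

  • Batch gradient descent (Batch GD): the original form, in which every parameter update uses all samples. It can reach the global optimum and is easy to parallelize, but training is extremely slow when there are many samples.
  • Stochastic gradient descent (Stochastic GD): addresses the slow training of BGD by using one random sample at a time. Training is fast, but accuracy drops, the result is not the global optimum, and it is not easy to parallelize.
  • Mini-batch gradient descent (MiniBatch GD): every parameter update uses b samples (for example 10 at a time), a compromise between BGD and SGD.

Using only one sample at a time is also called online learning or incremental learning.

When every sample of the training set has been used once, one epoch (one round of iteration) is complete.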
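The way an epoch is split into mini-batches can be sketched as follows. This is an illustration under assumptions: the helper name `mini_batches` and its `seed` parameter are hypothetical, and integers stand in for real training samples.

```python
import random

def mini_batches(samples, batch_size, seed=None):
    """Shuffle a copy of the samples and cut it into consecutive batches.
    One pass over the returned list corresponds to one epoch."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[k:k + batch_size]
            for k in range(0, len(shuffled), batch_size)]

# 25 stand-in samples, 10 per batch: the last batch is smaller.
batches = mini_batches(range(25), batch_size=10, seed=0)
print([len(batch) for batch in batches])
```

With `batch_size=1` this reduces to stochastic gradient descent, and with `batch_size=len(samples)` to batch gradient descent.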

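Gradient descent algorithm

If we want to reduce the overall cost by adjusting some parameter of the network, we can turn to calculus. The activation function of every layer and the final cost function are smooth and differentiable, so the final cost $C$ is also smooth and differentiable with respect to any parameter $w,b$ we are interested in. Nudging a parameter slightly changes the final error slightly and continuously. Keep adjusting each parameter a little in the direction given by its gradient, so that the total error moves downhill and finally reaches an extremum: that is the core idea of gradient descent.

Logic of gradient descent

Suppose the cost function $C$ is a differentiable function of two variables $v_1,v_2$. Gradient descent amounts to choosing a suitable $\Delta v$ that makes $\Delta C$ negative. From calculus:

$$ \Delta C \approx \frac{\partial C}{\partial v_1} \Delta v_1 + \frac{\partial C}{\partial v_2} \Delta v_2 $$

Here $\Delta v$ is the vector $\Delta v = \left[ \begin{array}{c} \Delta v_1 \\ \Delta v_2 \end{array}\right]$ and $\nabla C$ is the gradient vector $\left[ \begin{array}{c} \frac{\partial C}{\partial v_1} \\ \frac{\partial C}{\partial v_2} \end{array} \right]$, so the expression above can be rewritten as

$$ \Delta C \approx \nabla C \cdot \Delta v $$

What $\Delta v$ makes the change in the cost negative? A simple choice is to take $\Delta v$ to be a small vector collinear with the gradient $\nabla C$ and pointing the opposite way, that is $\Delta v = -\eta \nabla C$. The change in the cost is then $\Delta C \approx -\eta \|\nabla C\|^2$, which is guaranteed not to be positive. Following this rule and repeatedly updating $v \to v' = v - \eta \nabla C$ drives $C$ to a minimum.

This is what gradient descent means: every parameter keeps moving slightly downhill along its own gradient (derivative) direction, so that the total error reaches an extremum.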
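The update rule $v \to v' = v - \eta \nabla C$ can be watched at work on a toy problem. The sketch below assumes a hand-picked two-variable cost $C(v) = v_1^2 + 2 v_2^2$ (not from the original post) whose gradient is known in closed form; the names `cost`, `grad` and `gradient_descent` are illustrative.

```python
import numpy as np

def cost(v):
    """Toy cost C(v) = v1^2 + 2 * v2^2, with its minimum at the origin."""
    return v[0] ** 2 + 2.0 * v[1] ** 2

def grad(v):
    """Gradient of the toy cost: (dC/dv1, dC/dv2)."""
    return np.array([2.0 * v[0], 4.0 * v[1]])

def gradient_descent(v0, eta=0.1, steps=100):
    """Repeat v <- v - eta * grad(v) and record the cost after every step."""
    v = np.array(v0, dtype=float)
    history = [cost(v)]
    for _ in range(steps):
        v = v - eta * grad(v)
        history.append(cost(v))
    return v, history

v_min, history = gradient_descent([3.0, -2.0])
print(v_min, history[0], history[-1])
```

With a small enough learning rate $\eta$ the recorded cost never increases; a rate that is too large would overshoot the minimum instead.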

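For a neural network, the parameters being learned are the weights $w$ and the biases $b$. The principle is the same, except that the number of $w,b$ is enormous:

$$ w \to w' = w-\eta\frac{\partial C}{\partial w} \qquad b \to b' = b-\eta\frac{\partial C}{\partial b} $$

The really tricky problem is how to compute the gradients $\nabla C_w,\nabla C_b$. If we estimated the gradient of each parameter by a finite difference, $\frac{C(p+\epsilon)-C}{\epsilon}$, every parameter in the network would need its own feed-forward pass and its own evaluation of $C(p+\epsilon)$. Given the vast ocean of parameters in a neural network, this approach is not practical.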
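To make the cost of the finite-difference approach concrete, here is a small sketch that counts how many times the cost function has to be evaluated. The helper `numerical_gradient`, the call counter and the toy cost are assumptions made for the example.

```python
import numpy as np

def numerical_gradient(cost, params, eps=1e-6):
    """Estimate dC/dp_i as (C(p + eps * e_i) - C(p)) / eps for every parameter.
    This needs one extra cost evaluation per parameter."""
    base = cost(params)
    grad = np.zeros_like(params)
    for i in range(params.size):
        shifted = params.copy()
        shifted.flat[i] += eps
        grad.flat[i] = (cost(shifted) - base) / eps
    return grad

calls = [0]   # mutable counter so the wrapper below can update it

def counted_cost(p):
    """Toy cost C(p) = sum(p^2) that records how often it is evaluated."""
    calls[0] += 1
    return np.sum(p ** 2)

p = np.array([1.0, -2.0, 0.5])
g = numerical_gradient(counted_cost, p)
print(g, calls[0])
```

Three parameters already need four evaluations of the cost; for a network with a large number of weights, each evaluation being a full forward pass, the total quickly becomes impractical.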

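The back propagation algorithm solves this problem. Through a clever simplification, it computes the gradients of all parameters in the network efficiently with one feed-forward pass and one backward pass.

Back propagation

The back propagation algorithm takes one labeled sample $(x,y)$ as input and returns the gradients of all parameters $(W,b)$ in the network.

Back-propagated error δ

Back propagation needs a new concept: the error $\delta$. Its definition comes from a simple idea: if slightly changing the weighted input $z$ of a neuron no longer changes the final cost $C$, then $z$ can be considered to have reached an extremum and to be well tuned. So the partial derivative $\frac{\partial C}{\partial z}$ of the cost $C$ with respect to the weighted input $z$ of a neuron can serve as a measure of the error $\delta$ on that neuron. We therefore define the error $\delta^l_j$ on the $j$-th neuron of layer $l$ as:

$$ \delta^l_j \equiv \frac{\partial C}{\partial z^l_j} $$

Like the activation $a$ and the weighted input $z$, the error can be written as a vector. The error vector of layer $l$ is written $\delta^l$. The two choices look similar, but defining the error of a layer through the weighted input $z$ rather than the activation $a$ is deliberate: it keeps the resulting formulas neat.

The back-propagated error is introduced so that the gradients $\nabla C_w,\nabla C_b$ can be computed from the error vectors.

Back propagation in one sentence: compute the output-layer error, use a recurrence to work back to the error of every layer, then compute the weight gradient and bias gradient of each layer from the error of that layer.

This requires solving four problems:

  1. First term of the recurrence: how to compute the output-layer error $\delta^L$
  2. Recurrence: how to compute the error $\delta^l$ of a layer from the error $\delta^{l+1}$ of the layer after it
  3. Weight gradient: how to compute the weight gradient $\nabla W^l$ of a layer from its error $\delta^l$
  4. Bias gradient: how to compute the bias gradient $\nabla b^l$ of a layer from its error $\delta^l$

These four problems are solved by the four back propagation equations.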

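Back propagation equations

| Equation                                                          | Description                | No. |
|-------------------------------------------------------------------+----------------------------+-----|
| $\delta^L = \nabla C_a \odot \sigma'(z^L)$                        | Output-layer error         | BP1 |
| $\delta^l = (W^{l+1})^T \delta^{l+1} \odot \sigma'(z^l)$          | Error propagation          | BP2 |
| $\nabla C_{W^l} = \delta^l \times (a^{l-1})^T$                    | Weight gradient            | BP3 |
| $\nabla C_b = \delta^l$                                           | Bias gradient              | BP4 |

When the cost function is the MSE, $C = \frac{1}{2} \|\vec{y} -\vec{a}\|^2= \frac{1}{2} [(y_1 - a_1)^2 + \cdots + (y_{d_L} - a_{d_L})^2]$, and the activation function is the sigmoid:

| Equation                                                               | Description                                                                                                              | No. |
|------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------+-----|
| $\delta^L = (a^L - y) \odot (1-a^L) \odot a^L$                         | The output-layer error needs $a^L$ and $y$                                                                               | BP1 |
| $\delta^l = (W^{l+1})^T \delta^{l+1} \odot (1-a^l) \odot a^l$          | The error of this layer needs the next layer's weights $W^{l+1}$, the next layer's error $\delta^{l+1}$ and this layer's output $a^l$ | BP2 |
| $\nabla C_{W^l} = \delta^l \times (a^{l-1})^T$                         | The weight gradient needs this layer's error $\delta^l$ and the previous layer's output $a^{l-1}$                        | BP3 |
| $\nabla C_b = \delta^l$                                                | The bias gradient needs this layer's error $\delta^l$                                                                    | BP4 |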
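The four equations for the MSE and sigmoid case can be turned into a short, checkable sketch for a single sample. Everything here is illustrative: the tiny network sizes, the random parameters and the names `forward`, `cost` and `backprop` are assumptions, and one weight gradient is compared against a central finite difference as a sanity check.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(weights, biases, x):
    """Return the list of activations [a^1, ..., a^L] for one input column x."""
    activations = [x]
    for W, b in zip(weights, biases):
        activations.append(sigmoid(W @ activations[-1] + b))
    return activations

def cost(weights, biases, x, y):
    """Single-sample quadratic cost C = 1/2 * ||y - a^L||^2."""
    return 0.5 * np.sum((y - forward(weights, biases, x)[-1]) ** 2)

def backprop(weights, biases, x, y):
    """Gradients of the cost with respect to every weight matrix and bias vector."""
    a = forward(weights, biases, x)
    grad_w = [None] * len(weights)
    grad_b = [None] * len(biases)
    delta = (a[-1] - y) * (1 - a[-1]) * a[-1]                    # BP1
    for l in range(len(weights) - 1, -1, -1):
        grad_w[l] = delta @ a[l].T                               # BP3: outer product
        grad_b[l] = delta                                        # BP4
        if l > 0:
            delta = (weights[l].T @ delta) * (1 - a[l]) * a[l]   # BP2
    return grad_w, grad_b

rng = np.random.default_rng(0)
sizes = [3, 4, 2]                                                # hypothetical layer sizes
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal((n_out, 1)) for n_out in sizes[1:]]
x = rng.standard_normal((3, 1))
y = np.array([[1.0], [0.0]])

grad_w, grad_b = backprop(weights, biases, x, y)

# Finite-difference check of a single weight in the first weight matrix.
eps = 1e-6
weights[0][1, 2] += eps
c_plus = cost(weights, biases, x, y)
weights[0][1, 2] -= 2 * eps
c_minus = cost(weights, biases, x, y)
weights[0][1, 2] += eps                                          # restore the weight
numeric = (c_plus - c_minus) / (2 * eps)
print(numeric, grad_w[0][1, 2])
```

One forward pass and one backward pass yield every gradient at once, whereas the finite-difference check needs two extra forward passes for each individual parameter.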

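Proof of the back propagation equations

BP1: output-layer error equation

The output-layer error equation gives a way to compute the output-layer error $\delta^L$ from the network output $a^L$ and the label $y$:

$$ \delta^L = (a^L - y) \odot (1-a^L) \odot a^L $$

Proof

Because $a^L = \sigma(z^L)$, this equation follows directly from the definition of the back-propagated error, by applying the chain rule with $a^L$ as the intermediate variable:

$$ \frac{\partial C}{\partial z^L} = \frac{\partial C}{\partial a^L} \frac{\partial a^L}{\partial z^L} = \nabla C_a \odot \sigma'(z^L) $$

Since the cost function is $C = \frac{1}{2} \|\vec{y} -\vec{a}\|^2= \frac{1}{2} [(y_1 - a_1)^2 + \cdots + (y_{d_L} - a_{d_L})^2]$, taking the partial derivative of both sides with respect to some $a_j$ gives:

$$ \frac{\partial C}{\partial a^L_j} = (a^L_j-y_j) $$

This holds because the outputs of the other neurons do not affect the partial derivative of the cost with respect to the output of neuron $j$, and the coefficient cancels out exactly. In vector form this is $(a^L - y)$. On the other hand, it is easy to show that $\sigma'(z^L) = (1-a^L) \odot a^L$.

QED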

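BP2: error propagation equation

The error propagation equation gives a way to compute the error of a layer from the error of the layer after it:

$$ \delta^l = (W^{l+1})^T \delta^{l+1} \odot \sigma'(z^l) $$

Proof

This equation follows directly from the definition of the back-propagated error, by applying the chain rule with the weighted inputs $z^{l+1}$ of all neurons of the next layer as intermediate variables:

$$ \delta^l_j = \frac{\partial C}{\partial z^l_j} = \sum_{k=1}^{d_{l+1}} \frac{\partial C}{\partial z^{l+1}_k} \frac{\partial z^{l+1}_k}{\partial z^l_j} = \sum_{k=1}^{d_{l+1}} \left(\delta^{l+1}_k \frac{\partial z^{l+1}_k}{\partial z^l_j}\right) $$

The chain rule brings in the weighted inputs of the next layer as intermediate variables, which introduces the error of the next layer on the right-hand side. What remains is to find $\frac{\partial z^{l+1}_k}{\partial z^l_j}$. From the definition of the weighted input, $z = wx + b$, we have:

$$ z^{l+1}_k = W^{l+1}_{k,*} \cdot a^l + b^{l+1}_k = W^{l+1}_{k,*} \cdot \sigma(z^l) + b^{l+1}_k = \sum_{j=1}^{d_l} \left(w^{l+1}_{kj} \sigma(z^l_j)\right) + b^{l+1}_k $$

Differentiating both sides with respect to $z^l_j$ gives:

$$ \frac{\partial z^{l+1}_k}{\partial z^l_j} = w^{l+1}_{kj} \sigma'(z^l_j) $$

Substituting back:

\begin{align}
\delta^l_j &= \sum_{k=1}^{d_{l+1}} \left(\delta^{l+1}_k \frac{\partial z^{l+1}_k}{\partial z^l_j}\right) \\
&= \sigma'(z^l_j) \sum_{k=1}^{d_{l+1}} \left(\delta^{l+1}_k w^{l+1}_{kj}\right) \\
&= \sigma'(z^l_j) \, [\delta^{l+1} \cdot W^{l+1}_{*,j}] \\
&= \sigma'(z^l_j) \, [(W^{l+1})^T_{j,*} \cdot \delta^{l+1}]
\end{align}

Here the sum over the products of error and weight for all neurons of the next layer can be rewritten as the dot product of two vectors:

  • the error vector of the $k$ neurons of the next layer
  • column $j$ of the weight matrix of the next layer, that is, the weights of all connections from neuron $j$ of this layer to all $k$ neurons of the next layer.

Since a vector dot product can be rewritten as a matrix multiplication (a row vector times a column vector), we transpose the weight matrix: what was taken as a column before is now taken as a row vector. Rewriting the result in vector form gives:

$$ \delta^l = \sigma'(z^l) \odot (W^{l+1})^T\delta^{l+1} $$

QED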

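BP3: weight gradient equation

The weight gradient $\nabla C_{W^l}$ of each layer is the outer product of the error vector of this layer (a column vector) and the output vector of the previous layer (a row vector).

$$ \nabla C_{W^l} = \delta^l \times (a^{l-1})^T $$

Proof

From the definition of the error, taking partial derivatives with $w^l_{jk}$ as the intermediate variable gives:

\begin{align} \delta^l_j & = \frac{\partial C}{\partial z^l_j} = \frac{\partial C}{\partial w^l_{jk}} \frac{\partial w^l_{jk}}{\partial z^l_j} = \nabla C_{w^l_{jk}} \frac{\partial w^l_{jk}}{\partial z^l_j} \end{align}

By definition, the weighted input $z^l_j$ of neuron $j$ in layer $l$ is:

$$ z^l_j = \sum_k w^l_{jk} a^{l-1}_k + b^l_j $$

Differentiating both sides with respect to $w^l_{jk}$ gives:

$$ \frac{\partial z^l_j}{\partial w^l_{jk}} = a^{l-1}_k $$

Substituting back:

$$ \nabla C_{w^l_{jk}} = \delta^l_j \frac{\partial z^l_j}{\partial w^l_{jk}} = \delta^l_j a^{l-1}_k $$

By inspection, the vector form is an outer product:

$$ \nabla C_{W^l} = \delta^l \times (a^{l-1})^T $$

  • Error column vector of this layer: $\delta^l$, with dimension ($d_l \times 1$)
  • Activation row vector of the previous layer: $(a^{l-1})^T$, with dimension ($1 \times d_{l-1}$)

QED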

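BP4: bias gradient equation

$$ \nabla C_b = \delta^l $$

Proof

By definition:

$$ \delta^l_j = \frac{\partial C}{\partial z^l_j} = \frac{\partial C}{\partial b^l_j} \frac{\partial b^l_j}{\partial z^l_j} = \nabla C_{b^l_j} \frac{\partial b^l_j}{\partial z^l_j} $$

Because $z^l_j = W^l_{j,*} \cdot a^{l-1} + b^l_j$, differentiating both sides with respect to $z^l_j$ gives $1=\frac{\partial b^l_j}{\partial z^l_j}$. Substituting back gives $\nabla C_{b^l_j} =\delta^l_j$.

QED

All four equations are now proved. Turning them into code is all that is needed to make them work.

Implementing the neural network

As a proof of concept, here is a Python implementation of a neural network that classifies MNIST handwritten digits.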

# coding: utf-8
# author: vonng(fengruohang@outlook.com)
# ctime: 2017-05-10

import random
import numpy as np


class Network(object):
    def __init__(self, sizes):
        self.sizes = sizes                  # neurons per layer, e.g. [784, 100, 10]
        self.L = len(sizes)                 # number of layers, input layer included
        self.layers = range(0, self.L - 1)  # indices into self.w and self.b
        self.w = [np.random.randn(y, x) for x, y in zip(sizes[:-1], sizes[1:])]
        self.b = [np.random.randn(x, 1) for x in sizes[1:]]

    def feed_forward(self, a):
        # a = sigmoid(w . a + b), layer by layer
        for l in self.layers:
            a = 1.0 / (1.0 + np.exp(-np.dot(self.w[l], a) - self.b[l]))
        return a

    def gradient_descent(self, train, test, epoches=30, m=10, eta=3.0):
        # accept iterators (such as zip objects) as well as lists
        train, test = list(train), list(test)
        for round in range(epoches):
            # generate mini batches of m samples each
            random.shuffle(train)
            for batch in [train[k:k + m] for k in range(0, len(train), m)]:
                # one column per sample
                x = np.array([item[0].reshape(784) for item in batch]).transpose()
                y = np.array([item[1].reshape(10) for item in batch]).transpose()
                n, r, a = len(batch), eta / len(batch), [x]

                # forward & save activations
                for l in self.layers:
                    a.append(1.0 / (np.exp(-np.dot(self.w[l], a[-1]) - self.b[l]) + 1))

                # back propagation
                d = (a[-1] - y) * a[-1] * (1 - a[-1])   # BP1
                for l in range(1, self.L):  # l is reverse index since last layer
                    if l > 1:   # BP2 (w[-l + 1] was already updated in the previous step)
                        d = np.dot(self.w[-l + 1].transpose(), d) * a[-l] * (1 - a[-l])
                    self.w[-l] -= r * np.dot(d, a[-l - 1].transpose())  # BP3
                    self.b[-l] -= r * np.sum(d, axis=1, keepdims=True)  # BP4

            # evaluate on the test set after every round
            acc_cnt = sum([np.argmax(self.feed_forward(x)) == y for x, y in test])
            print("Round {%d}: {%s}/{%d}" % (round, acc_cnt, len(test)))


if __name__ == '__main__':
    import mnist_loader

    train_data, valid_data, test_data = mnist_loader.load_data_wrapper()
    net = Network([784, 100, 10])
    net.gradient_descent(train_data, test_data, epoches=100, m=10, eta=2.0)

Data loading script: mnist_loader.py. The input data is a list of 2-tuples: (input(784,1), output(10,1))
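
To make the data layout concrete, the sketch below builds a batch of synthetic tuples in the same (input(784,1), output(10,1)) shape and stacks them the way the training loop does. The random values and the helper `fake_sample` are assumptions for illustration only. This is not real MNIST data, and it does not call mnist_loader.

```python
import numpy as np

rng = np.random.RandomState(0)

def fake_sample():
    """Synthetic stand-in for one training tuple (not real MNIST data)."""
    x = rng.rand(784, 1)                 # pixel intensities in [0, 1)
    y = np.zeros((10, 1))
    y[rng.randint(10)] = 1.0             # one-hot label
    return x, y

batch = [fake_sample() for _ in range(10)]

# Same stacking as in gradient_descent: one sample per column
x = np.array([item[0].reshape(784) for item in batch]).transpose()
y = np.array([item[1].reshape(10) for item in batch]).transpose()

print(x.shape, y.shape)   # (784, 10) (10, 10)
```

Stacking samples as columns lets a single `np.dot(w, x)` push the whole mini-batch through a layer at once.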

$ python net.py
Round {0}: {9136}/{10000}
Round {1}: {9265}/{10000}
Round {2}: {9327}/{10000}
Round {3}: {9387}/{10000}
Round {4}: {9418}/{10000}
Round {5}: {9470}/{10000}
Round {6}: {9469}/{10000}
Round {7}: {9484}/{10000}
Round {8}: {9509}/{10000}
Round {9}: {9539}/{10000}
Round {10}: {9526}/{10000}

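After just one epoch, the network already exceeds 90% classification accuracy on the test set, and it eventually converges to around 96%.

For fifty lines of code this result is remarkable. Still, 96% accuracy would probably be unacceptable in real production use. To do better, the neural network has to be optimized.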

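Optimizing the neural network

That is about all there is to the basics of neural networks, but improving their performance is an endless challenge. Each optimization technique could be studied in depth as an advanced topic of its own. The techniques are as varied as they come: some draw on mathematics, some on science, some on engineering, some on philosophy, and some on what can only be called black magic…

There are several main ways to improve how well a neural network learns:

  • Choose a better cost function: for example, cross-entropy
  • Regularization: L2 regularization, dropout, L1 regularization
  • Use a different kind of activation neuron: rectified linear units (ReLU), hyperbolic tangent neurons (tansig)
  • Modify the output layer of the network: softmax
  • Change how the network's input is organized: recurrent neural networks (Recurrent NN), convolutional neural networks (Convolutional NN)
  • Add layers: deep neural networks (Deep NN)
  • Choose suitable hyper-parameters by experiment, and adjust them dynamically according to the epoch count or evaluation results
  • Use a different gradient descent method: momentum-based gradient descent
  • Use better weight initialization
  • Artificially expand the existing training data set

Two of these methods are introduced here, the cross-entropy cost function and L2 regularization, because they:

  • Are simple to implement: a one-line code change is enough, and the computational cost even goes down.
  • Pay off immediately, cutting the classification error rate from 4% to below 2%.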

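Cost function: cross-entropy

MSE is a decent cost function, but it has one awkward problem: learning speed.

With MSE, the output-layer error is computed as: $$ δ^L = (a^L - y) σ'(z^L) $$

The sigmoid is also known as the logistic curve, and its derivative $σ'$ is bell-shaped. So as the weighted input $z$ sweeps from large to small or from small to large, the gradient goes through a "small, large, small" progression. Learning is dragged down by the derivative term and goes through a matching "slow, fast, slow" progression.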
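
The bell shape is easy to see by evaluating $σ'(z) = σ(z)(1 - σ(z))$ at a few points. The sample values of $z$ below are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

zs = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
for z, g in zip(zs, sigmoid_prime(zs)):
    print("z = %6.1f   sigma'(z) = %.6f" % (z, g))
```

The derivative peaks at 0.25 when $z = 0$ and falls to roughly $4.5 × 10^{-5}$ at $|z| = 10$, so a saturated neuron receives almost no gradient under MSE even when its output is badly wrong.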

| MSE | Cross Entropy |
| /images/Vonng/mse.png | /images/Vonng/cross-entropy.png |

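If we instead use the cross-entropy cost function:

$$ C = - \frac{1}{n} ∑_x [ y \ln(a) + (1-y) \ln(1-a) ] $$

For a single sample this is:

$$ C = - [ y \ln(a) + (1-y) \ln(1-a) ] $$

It looks complicated, but the output-layer error formula becomes remarkably simple: $δ^L = a^L - y$

Compared with MSE, the derivative factor is gone, so the error is directly proportional to (predicted value - actual value). Learning is no longer slowed down by the derivative of the activation function, and the computation is simpler as well.

Proof

Differentiating $C$ with respect to the network output $a$ gives: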

$$ ∇C_a = \frac {∂C} {∂a^L} = - [ \frac y a - \frac {(1-y)} {1-a}] = \frac {a - y} {a (1-a)} $$

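Of the four fundamental backpropagation equations, only BP1, the formula for the output-layer error, involves the cost function $C$.

$$ δ^L = ∇C_a ⊙ σ'(z^L) $$

Now that $C$ is computed differently, we substitute the gradient $\frac{∂C}{∂a^L}$ of the new cost function with respect to the output $a^L$ back into BP1. Since $σ'(z^L) = a(1-a)$ for the sigmoid, this gives: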

$$ δ^L = \frac {a - y} {a (1-a)}× a(1-a) = a-y $$
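
The difference between the two costs shows up clearly on a saturated neuron. The sketch below is illustrative only: the values `z = 4.0` and `y = 0.0` and the helper `cross_entropy` are assumptions chosen for the example. It compares the two output-layer errors and uses a finite difference to confirm that $∂C/∂z = a - y$ for the cross-entropy cost.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cross_entropy(z, y):
    """Single-sample cross-entropy cost as a function of the weighted input z."""
    a = sigmoid(z)
    return -(y * np.log(a) + (1 - y) * np.log(1 - a))

z, y = 4.0, 0.0            # a saturated neuron that is badly wrong (a is close to 1)
a = sigmoid(z)

delta_mse = (a - y) * a * (1 - a)   # BP1 with the quadratic cost
delta_ce = a - y                    # BP1 with the cross-entropy cost

# Finite-difference check that dC/dz equals a - y for cross-entropy
eps = 1e-6
numeric = (cross_entropy(z + eps, y) - cross_entropy(z - eps, y)) / (2 * eps)

print("MSE delta: %.4f" % delta_mse)
print("CE  delta: %.4f" % delta_ce)
print(abs(numeric - delta_ce) < 1e-6)
```

For this neuron the cross-entropy error is more than an order of magnitude larger than the MSE error, which is exactly the learning signal that the $σ'$ factor would otherwise suppress.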

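Regularization

A model with a large number of free parameters can describe an astonishing range of phenomena.

As Fermi put it, quoting von Neumann: "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk". It is hard to imagine what marvels a model like a neural network, with parameters routinely numbering in the millions, could fit.

A model may fit the existing data well merely because it has enough degrees of freedom to describe almost any data set of the given size, not because it has captured any real insight into what underlies the data. When that happens, the model performs well on the existing data but generalizes poorly to new data. This is called overfitting.

For example, fitting a sine function with random noise using a 3rd-degree polynomial looks quite reasonable, whereas a 10th-degree polynomial fits every point in the data set perfectly yet predicts absurdly. It is mostly fitting the noise in the data set rather than the underlying pattern.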

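import numpy as np
import matplotlib.pyplot as plt

x, xs = np.linspace(0, 2 * np.pi, 10), np.arange(0, 2 * np.pi, 0.001)
y = np.sin(x) + np.random.randn(10) * 0.4  # noisy samples of sin(x)
p1, p2 = np.polyfit(x, y, 10), np.polyfit(x, y, 3)  # 10th- and 3rd-degree fits
plt.plot(xs, np.polyval(p1, xs));plt.plot(x, y, 'ro');plt.plot(xs, np.sin(xs), 'r--')
plt.figure()  # draw the 3rd-degree fit on its own figure
plt.plot(xs, np.polyval(p2, xs));plt.plot(x, y, 'ro');plt.plot(xs, np.sin(xs), 'r--')
plt.show()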

| 3rd-degree polynomial | 10th-degree polynomial |
| /images/Vonng/overfit-3.png | /images/Vonng/overfit-10.png |

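The true test of a model is its ability to predict in situations it has never seen, known as its ability to generalize.

How can overfitting be avoided? By Occam's razor: of two explanations that work equally well, choose the simpler one.

Of course this principle is only a belief we hold, not a theorem or an iron law: we cannot rule out the possibility that these data points really were generated by the fitted 10th-degree polynomial…

In short, very large weight parameters usually signal overfitting. The coefficients of the fitted 10th-degree polynomial, for example, are wildly distorted: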

-0.001278386964370502
0.02826407452052734
-0.20310716176300195
0.049178327509096835
7.376259706365357
-46.295365250182925
135.58265224859255
-211.767050023543
167.26204130954324
-50.95259728945658
0.4211227089756039
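
The same effect can be reproduced with a fixed random seed. The sketch below is a variation on the experiment above, not the original run: the seed is arbitrary, and it uses degree 9 instead of 10 because with 10 data points that is the lowest degree that can pass through every point. It reports the training error, the error against the true sine curve, and the largest coefficient magnitude for each fit.

```python
import numpy as np

rng = np.random.RandomState(42)
x = np.linspace(0, 2 * np.pi, 10)
y = np.sin(x) + rng.randn(10) * 0.4          # noisy training samples
xs = np.linspace(0, 2 * np.pi, 1000)         # dense grid for the true curve

results = {}
for degree in (3, 9):
    p = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(p, x) - y) ** 2)
    true_err = np.mean((np.polyval(p, xs) - np.sin(xs)) ** 2)
    results[degree] = (train_err, true_err, np.abs(p).max())
    print("degree %d: train MSE %.2e, MSE vs sin %.2e, max |coef| %.1f"
          % (degree, train_err, true_err, results[degree][2]))
```

The high-degree fit drives the training error to nearly zero. Whether it also strays from the true curve, and how large its coefficients grow, depends on the particular noise draw, which is why the numbers are printed rather than assumed here.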

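Adding a weight decay term effectively curbs overfitting. $L2$ regularization, for example, adds a penalty term of the form $\frac{λ}{2} w^2$ to the loss function:

$$ C = -\frac{1}{n} ∑_{xj} \left[ y_j \ln a^L_j + (1-y_j) \ln (1-a^L_j) \right] + \frac{λ}{2n} ∑_w w^2 $$

So the larger the weights, the larger the loss, which keeps the network from drifting toward distorted parameters.

The cross-entropy loss is used here, but whatever the loss function, it can be written as:

$$ C = C_0 + \frac{λ}{2n} ∑_w w^2 $$

where $C_0$ is the original cost function. The partial derivative of the loss with respect to a weight can then be written as:

$$ \frac{∂C}{∂w} = \frac{∂C_0}{∂w} + \frac{λ}{n} w $$

Therefore the only computational change introduced by the $L2$ penalty is that each weight is first multiplied by a decay factor when the gradient step is applied:

$$ w → w' = w\left(1 - \frac{ηλ}{n} \right) - η\frac{∂C_0}{∂w} $$

Note that $n$ here is the total number of training samples, not the number of samples in a mini-batch.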
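
The equivalence between "gradient step on the regularized cost" and "decay, then ordinary gradient step" can be checked in a few lines. In this sketch the weight matrix and the gradient of $C_0$ are random stand-ins, and `n = 50000` is an assumed training-set size chosen only for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
w = rng.randn(3, 2)          # some weight matrix
grad_c0 = rng.randn(3, 2)    # stand-in for dC_0/dw (random, for illustration)
eta, lmd, n = 0.1, 5.0, 50000

# Form 1: gradient step on the full regularized cost C = C_0 + lmd/(2n) * sum(w^2)
w_full = w - eta * (grad_c0 + lmd / n * w)

# Form 2: shrink the weights first, then take the unregularized step
w_decay = w * (1 - eta * lmd / n) - eta * grad_c0

print(np.allclose(w_full, w_decay))   # True
print(1 - eta * lmd / n)              # decay factor applied at every update
```

The decay factor is only slightly below 1, but it is applied at every mini-batch update, so weights that the data does not keep pushing up shrink steadily over an epoch.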

Improved implementation
# coding: utf-8
# author: vonng(fengruohang@outlook.com)
# ctime: 2017-05-10

import random
import numpy as np

class Network(object):
    def __init__(self, sizes):
        self.sizes = sizes
        self.L = len(sizes)
        self.layers = range(0, self.L - 1)
        self.w = [np.random.randn(y, x) / np.sqrt(x) for x, y in zip(sizes[:-1], sizes[1:])]
        self.b = [np.random.randn(x, 1) for x in sizes[1:]]

    def feed_forward(self, a):
        for l in self.layers:
            a = 1.0 / (1.0 + np.exp(-np.dot(self.w[l], a) - self.b[l]))
        return a

    def gradient_descent(self, train, test, epoches=30, m=10, eta=0.1, lmd=5.0):
        train, test = list(train), list(test)  # shuffle and len need real lists
        n = len(train)  # total number of training samples, used by weight decay
        for epoch in range(epoches):
            random.shuffle(train)
            for batch in [train[k:k + m] for k in range(0, len(train), m)]:
                # stack the batch column-wise: x is (784, m), y is (10, m)
                x = np.array([item[0].reshape(784) for item in batch]).transpose()
                y = np.array([item[1].reshape(10) for item in batch]).transpose()
                r = eta / len(batch)
                w = 1 - eta * lmd / n  # L2 decay factor

                # forward & save activations
                a = [x]
                for l in self.layers:
                    a.append(1.0 / (np.exp(-np.dot(self.w[l], a[-1]) - self.b[l]) + 1))

                d = (a[-1] - y)  # cross-entropy    BP1
                for l in range(1, self.L):
                    if l > 1:   # BP2
                        d = np.dot(self.w[-l + 1].transpose(), d) * a[-l] * (1 - a[-l])
                    self.w[-l] *= w  # weight decay
                    self.w[-l] -= r * np.dot(d, a[-l - 1].transpose())  # BP3
                    self.b[-l] -= r * np.sum(d, axis=1, keepdims=True)  # BP4

            # evaluate: count test samples whose arg-max output equals the label
            acc_cnt = sum([np.argmax(self.feed_forward(x)) == y for x, y in test])
            print("Round {%d}: {%s}/{%d}" % (epoch, acc_cnt, len(test)))


if __name__ == '__main__':
    import mnist_loader
    train_data, valid_data, test_data = mnist_loader.load_data_wrapper()
    net = Network([784, 100, 10])
    net.gradient_descent(train_data, test_data, epoches=50, m=10, eta=0.1, lmd=5.0)
Round {0}: {9348}/{10000}
Round {1}: {9538}/{10000}
Round {2}: {9589}/{10000}
Round {3}: {9667}/{10000}
Round {4}: {9651}/{10000}
Round {5}: {9676}/{10000}
...
Round {25}: {9801}/{10000}
Round {26}: {9799}/{10000}
Round {27}: {9806}/{10000}
Round {28}: {9804}/{10000}
Round {29}: {9804}/{10000}
Round {30}: {9802}/{10000}

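Evidently these simple changes alone improve accuracy markedly, and the network eventually converges to about 98%.

Changing the sizes to [784,128,64,10], which adds one more hidden layer, raises accuracy further to 98.33% on the test set and 98.24% on the validation set.

For the MNIST digit classification task, the best accuracy reported so far is 99.79%; the cases that are still misclassified would probably be hard for a human to identify correctly too. For the latest progress in neural network classification results, see: classification_datasets_results

This article is a write-up of notes on Neural Networks and Deep Learning, a tutorial officially recommended by TensorFlow. Original text: Github Page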

Pre-Draft State

If a post has the TODO keyword, the draft front matter variable should be set to true.

The idea is to mark as TODO a post or blog idea that you have yet to start writing.

Draft state

If a post has the DRAFT keyword too, the draft front matter variable should be set to true.

The idea is to mark as DRAFT a post that you have already started writing, or are writing at the moment, but that is not yet ready to be published.

Draft state with other headlines

The “TODO” state of this heading (which is nil) should be ignored

If a post has the DRAFT state set, the draft front matter variable should be set to true, even if the post has a sub-heading immediately after the post heading.

Reusable sections

Nested sections example

Post sub-heading 1

Post sub-heading 1.1

Post sub-heading 1.1.1

Post sub-heading 1.2

Post sub-heading 1.3

Post sub-heading 2

Post sub-heading 2.1

Post sub-heading 2.2

Post sub-heading 2.2.1
Post sub-heading 2.2.2
The UNNUMBERED property for this subtree is set to t. So this heading will show up as unnumbered in both the post body and the TOC.
Post sub-heading 2.2.3

Post sub-heading 3

Post sub-heading 3.1

Example text with code blocks

Here are a few variables that you might like to change in the local.mk:
prefix
Org installation directory
prefix = /dir/where/you/want/to/install/org # Default: /usr/share
    

The .el files will go to $(prefix)/emacs/site-lisp/org by default. If you’d like to change that, you can tweak the lispdir variable.

infodir
Org Info installation directory. I like to keep the Info file for development version of Org in a separate directory.
infodir = $(prefix)/org/info # Default: $(prefix)/info
    
ORG_MAKE_DOC
Types of Org documentation you’d like to build by default.
# Define below if you only need info documentation, the default includes html and pdf
ORG_MAKE_DOC = info pdf card # html
    
ORG_ADD_CONTRIB
Packages from the contrib/ directory that you’d like to build along with Org. Below are the ones on my must-have list.
# Define if you want to include some (or all) files from contrib/lisp
# just the filename please (no path prefix, no .el suffix), maybe with globbing
#   org-eldoc - Headline breadcrumb trail in minibuffer
#   ox-extra - Allow ignoring just the heading, but still export the body of those headings
#   org-mime - Convert org buffer to htmlized format for email
ORG_ADD_CONTRIB = org-eldoc ox-extra org-mime
    

Here’s an example of an emacs-lisp block:

(defvar emacs-version-short (format "%s_%s"
                                    emacs-major-version emacs-minor-version)
  "A variable to store the current emacs versions as <MAJORVER>_<MINORVER>.
So, for emacs version 25.0.50.1, this variable will be 25_0.")

Source block with line numbers examples

#+BEGIN_SRC emacs-lisp -n
;; this will export with line number 1 (default)
(message "This is line 2")
#+END_SRC
#+BEGIN_SRC emacs-lisp -n 20
;; this will export with line number 20
(message "This is line 21")
#+END_SRC
#+BEGIN_SRC emacs-lisp +n
;; This will be listed as line 22
(message "This is line 23")
#+END_SRC
#+BEGIN_SRC emacs-lisp +n 10
;; This will be listed as line 33
(message "This is line 34")
#+END_SRC

Source block with line highlighting examples

#+BEGIN_SRC emacs-lisp :hl_lines 1,3-5
(message "This is line 1")
(message "This is line 2")
(message "This is line 3")
(message "This is line 4")
(message "This is line 5")
(message "This is line 6")
#+END_SRC
#+BEGIN_SRC emacs-lisp -n 7 :hl_lines 1,3-5
(message "This is line 7 in code, but line 1 for highlighting reference")
(message "This is line 8 in code, but line 2 for highlighting reference")
(message "This is line 9 in code, but line 3 for highlighting reference")
(message "This is line 10 in code, but line 4 for highlighting reference")
(message "This is line 11 in code, but line 5 for highlighting reference")
(message "This is line 12 in code, but line 6 for highlighting reference")
#+END_SRC
#+BEGIN_SRC emacs-lisp -n :hl_lines 1,3-5
(message "This is line 1")
(message "This is line 2")
(message "This is line 3")
(message "This is line 4")
(message "This is line 5")
(message "This is line 6")
#+END_SRC

Lorem Ipsum

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Quisque et quam metus. Etiam in iaculis mi, sit amet pretium magna. Donec ut dui mi. Maecenas pharetra sapien nunc, ut mollis enim aliquam quis. Nam at ultricies metus. Nulla tempor augue in vestibulum tristique. Phasellus volutpat pharetra metus quis suscipit. Morbi maximus sem dolor, id accumsan ipsum commodo non.

Fusce quam ligula, gravida ac dui venenatis, bibendum commodo lorem. Duis id elit turpis. Integer sed diam nibh. Donec tempus lacinia odio, a laoreet velit dictum id. Suspendisse efficitur euismod purus et porttitor. Aliquam sit amet tellus mauris. Mauris semper dignissim nibh, faucibus vestibulum purus varius quis. Suspendisse potenti. Cras at ligula sit amet nunc vehicula condimentum quis nec est. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Donec iaculis, neque sit amet maximus rhoncus, nisl tortor convallis ante, ut mollis purus augue ut justo. Praesent felis urna, volutpat sit amet posuere dictum, luctus quis nibh. Proin et tristique ipsum, in aliquam ante.

Aenean eget ex mauris. Cras ut tempor quam. Curabitur eget nulla laoreet, bibendum neque porta, tempus nulla. Ut tellus nisi, semper eu ligula pretium, aliquam accumsan dolor. Nunc fermentum cursus arcu eu suscipit. Nam dolor tellus, efficitur sed condimentum at, fringilla eget nisi. Nulla luctus metus felis. Suspendisse potenti. Cras lacinia orci nec dui sodales commodo. Donec tellus arcu, congue porta ultrices non, pretium et sapien. Proin mattis risus dignissim feugiat tristique. Donec nibh lorem, facilisis id posuere ut, varius ac urna. Etiam ultrices dignissim mauris, quis venenatis ex semper ut.

Curabitur id fermentum erat, rhoncus scelerisque est. Sed pulvinar, nulla sed sollicitudin scelerisque, ipsum erat sollicitudin dolor, ut commodo arcu justo vel libero. Curabitur turpis dolor, fermentum ut elit a, vehicula commodo nunc. Sed sit amet blandit nulla, quis sodales massa. Donec lobortis, urna vel volutpat ullamcorper, mauris est efficitur nulla, et suscipit velit dui at metus. Aliquam id sem sed metus tristique scelerisque nec vitae odio. Phasellus a pellentesque libero, vel convallis metus. Sed nec fringilla magna, non varius ex. Sed interdum eleifend ligula, quis porta enim ultrices a. Donec hendrerit diam ac elementum tincidunt.

Pellentesque eget nisl rhoncus, malesuada justo nec, suscipit quam. Nam sodales mauris eu bibendum suscipit. Vivamus sodales dui lorem, scelerisque pellentesque diam fermentum sed. Etiam fermentum nisl id nisl blandit, sit amet semper erat ultricies. Nulla tincidunt nulla metus, eu imperdiet lorem malesuada sagittis. Maecenas accumsan risus sed ante eleifend, vitae pretium leo porta. Suspendisse vitae eros vitae dui ornare condimentum id sit amet mauris. Etiam tincidunt consequat risus, eu posuere mi. Donec ut nunc eu massa porttitor suscipit nec nec neque. Suspendisse vitae tincidunt justo, sed molestie velit. Nullam pellentesque convallis ante, vel posuere libero blandit in.

Footnotes

[fn:4] Even if the user has set the HUGO_CODE_FENCE value to t (via variable, keyword or subtree property), the Hugo highlight shortcode will be used automatically instead of code fences if either (i) the user has chosen to either show the line numbers, or (ii) has chosen to highlight lines of code (See the ox-hugo documentation on {{{doc(source-blocks,Source Blocks)}}}).

[fn:3] This is a long footnote. It is so long that it gets auto-filled over multiple lines. But even then it should be handled fine.

[fn:2] Second footnote

[fn:1] First footnote