Merge pull request #2469 from tbrowder/note-update
add some more info
tbrowder committed Nov 3, 2018
2 parents b099266 + 7b9e1be commit bf2e279
Showing 1 changed file with 152 additions and 73 deletions.
225 changes: 152 additions & 73 deletions docs/rakudo-nqp-and-pod-notes.md
@@ -2,30 +2,38 @@

## Traps for the Perl 6 programmer

+ **DO NOT use '$0' in match results** - The Perl 6 shorthand for a
match variable '**$0**' doesn't work in NQP. Instead, use **$/[0]**
for the zeroth match. Note that the parser will otherwise be very
confused, and it currently cannot point to the error.

+ **DO NOT use 'nqp::say'** - The routine '**say**' is an NQP built-in
and it does not need the '**nqp::**' prefix. You can sometimes get
away with using '**nqp::say**' but, when you least expect it, the
parser will fail without a helpful error message.

+ **DO use 'nqp::die'** - As opposed to '**say**', '**die**' does need
to be qualified with '**nqp::**'. If used without the '**nqp::**'
prefix, you sometimes may get a very unhelpful error message.

+ **BE WARNED about '$\<some-match-var>' inside a sub with a '$/'
arg** - Use the full syntax for a match variable
('**$/\<some-match-var>**') for more reliable (or at least
self-documenting) results.

+ **BE WARNED about '$\<a-match-var>' versus
'$\<a-match-var>\*\*1'** - The first form will result in a scalar
object while the '\*\*' form will result in an array. Either form
may be appropriate for the situation, but proper handling will vary
for each.

+ **BE WARNED about "return if (...)" statements** - Sometimes they
work and sometimes not. But the failure message is usually good
enough to find the offending code.

For example, all these failed:

```
return if !nqp::elems(@arr);
return unless nqp::elems(@arr);
@@ -37,29 +45,43 @@ if !nqp::elems(@arr) {
return;
}
```


## Pod compilation overview

Pod is parsed as it is discovered during the parsing phase of each
compilation unit. Each pod object (string, paragraph, block,
configuration, term, heading, item, etc.) is serialized as it is
completed, and the result is a QAST node. The appropriate QAST nodes
(which have also been marked as a *compile_time_constant*) are then
assembled into instances of the pod classes defined in
**src/core/Pod.pm6**.

## Pod block text content handling

Text inside pod blocks that is content rather than markup consists of
intermixed text and formatting code characters. Newlines and
contiguous whitespace may or may not be significant, depending upon
the general block type (abbreviated, paragraph, delimited, or
declarator) or the block identifier (e.g., *code*, *input*, *output*,
*defn*, *comment*, or *data*).

The content as it is parsed in Grammar.nqp is first broken into
individual characters which are then assigned to one of three token
groups: regular text, text with formatting code, and text that is to
be unchanged from its input form (*code*, *input*, and *output*).

The regular text and intermingled formatted text are then divided into
two more categories: text that will form one or more paragraphs and
text that is part of a table. Ultimately, each paragraph of text
should be grouped into the @contents array of a single
**Pod::Block::Para**, but not all pod handling per S26 has been fully
implemented.

Some notable, not-yet-implemented (NYI) features (in order of one
dev's TODO list):

1. NYI: %config :numbered aliasing with '#' for paragraph or delimited
blocks

2. NYI: pod data blocks

@@ -69,7 +91,7 @@ Some notable, not-yet-implemented (NYI) features (in order of one dev's TODO lis

5. NYI: pod configuration aliasing

6. NYI: formatting code in declarator blocks (not described in S26, but a user-requested feature)

7. NYI: consistent use of the Pod::Block::Para as the leaf parent of all regular text

@@ -79,47 +101,102 @@ Some notable, not-yet-implemented (NYI) features (in order of one dev's TODO lis

10. NYI: nested delimited comment blocks

11. NYI: configuration data on continuation lines are not always
handled correctly

Anyone wanting to work on any of the NYI items should coordinate on
IRC channel #perl6-dev to avoid duplicate efforts. Most of the items
are being worked on in a generally logical order of need and of
knowledge gained while implementing pod features.

## Pod nesting

Work with pod is complicated by the fact that pod blocks can be
nested, i.e., a pod block can have pod blocks as children, to any
depth. In general, that applies to *delimited blocks*. (Other block
types may have single blocks as children, usually one or two
**Pod::Block::Para** objects.)

One consequence of this is that a pod block with children cannot be
created until all its children have been created. Another consequence
is that a pod block can have several parts, some of which cannot be
created until child components are analyzed or created.
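
As a small illustration of the nesting described above (the block
contents are invented for this sketch), a delimited *item* block can
hold a paragraph and a delimited *code* block as children:

```
=begin pod
=begin item
An outer item with a nested code block.

=begin code
say 'hello';
=end code
=end item
=end pod
```

Under the constraint just stated, the paragraph and the *code* block
would have to be completed before the *item* block, which in turn
would have to be completed before the enclosing *pod* block.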

## Pod block parts

A pod block can have several parts, all of which must be created
before the block itself can be created. Those parts are:

* Configuration - `%.config` [all blocks inheriting from class **Pod::Block**]

* The configuration cannot be created until the block text data are analyzed.

Note that *abbreviated* blocks cannot have an explicit configuration
section, but they may have limited implicit configuration data through
use of a *:numbered alias* (see below).

* Contents - `@.contents` [all blocks inheriting from class **Pod::Block**]

* The contents cannot be created until all child blocks are created.

* Term - `$.term` [*defn* blocks]

* The term cannot be created until the block text data are analyzed.

* Caption - `$.caption` [*table* blocks]

* The caption cannot be created until the configuration is analyzed.

* Headers - `@.headers` [*table* blocks]

* The headers cannot be created until the block text data are analyzed.
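
The following pod sketch, assuming S26 syntax and using invented
caption and text, shows where those parts would come from in the
source:

```
=begin table :caption<Tokens with action methods>
Grammar token   | Action method?
================|===============
pod_textcontent | yes
=end table

=begin defn
NYI
Not yet implemented.
=end defn
```

The *table* block's caption is given as configuration data
(**:caption**), which is why it depends on the configuration, while
its headers (the row above the separator line) and the *defn* block's
term (the first line, here "NYI") are only known once the block text
has been analyzed.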

## The <pod_textcontent> token

The token **pod_textcontent** is the match object for regular text and
formatted code as described above. It is the source of the final
contents object for regular text containers except for the table
blocks which will be discussed separately. It has a corresponding
action method.

Tracing the pod class building code is tedious, and the code is not
well documented. An object is often made early for a token in the
grammar, along with other objects, and attached to the .ast attribute
of that token's match object, which is then used later in another
object. The developer who wants to change the code using that .ast in
that other object (which may be in the grammar, actions, or
src/Perl6/Pod.nqp) has to refer back to the original make point to see
its format before making any changes--not fun! There is an ongoing
effort to better document the process for later developers.

Following is the start of a table to show the grammar tokens that have
action methods.

| Grammar tokens | Action method? | Pod sub? |
| --- | --- | --- |
| pod_textcontent | yes | |

## :numbered aliasing

S26 allows for the '#' character (Unicode name **NUMBER SIGN**), as
the first word in a block, to turn on the **:numbered** %config key;
in that case the '#' will be removed from the data. The user can
allow a '#' to be recognized as data by either (1) setting the %config
numbered key to false, typically with the **:!numbered** form, or (2)
using the **V** formatting code around the '#' in the data like this:
**V<#>**.
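
A sketch of those cases in pod, per S26 (the item text is invented,
and not all of these forms may be handled by the current
implementation):

```
=item # This item is numbered; the '#' is not part of the data

=item V<#> This item begins with a literal '#'

=for item :!numbered
# This item also begins with a literal '#'
```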

Proper handling of this feature requires changing the block's %config
hash after the block data have been parsed or possibly changing the
parsing of the first block data word due to the presence of
**:!numbered** in the %config hash. Another problem is how to handle
duplicate or incompatible %config keys and values.

The easiest case to handle is the abbreviated block which cannot have
explicit %config data and for which the :numbered alias is most
useful. Examples of the abbreviated blocks most likely to use this
option are the **=item**, **=head**, and **=defn** types.

The '#' turns on the **:numbered** configuration in all these cases:

@@ -142,12 +219,14 @@ not good practice but have to be handled gracefully:
# foo bar
```

The **:!numbered** is interpreted to mean accepting the '#' as part of
block data.

```
=for para :numbered
# foo bar
```

The '#' means the same as the **:numbered** option: the renderer
should number the paragraph, and the two **:numbered** keys (one
explicit and one implicit) are redundant.
