Commit

reformatted with fold program
antquinonez committed Jan 31, 2017
1 parent 1c170df commit 98b6e38
Showing 1 changed file with 86 additions and 83 deletions.
169 changes: 86 additions & 83 deletions doc/Language/grammar_tutorial.pod6
@@ -27,10 +27,10 @@ language.
=SUBTITLE The Broad Concept of Perl Grammars
Regular Expressions (regex) work well to find patterns in strings and
manipulate them. However, when you need to find multiple patterns at once, or
need to combine patterns, or test for patterns that may surround strings or
other patterns, regular expressions alone are not adequate.
Grammars provide a way to define how you want to examine a string using regular
expressions, and you can group these regular expressions together to provide
@@ -74,8 +74,8 @@ by the names you used to define your methods.
Now, you may be wondering: if I have all these regexes defined that just return
their results, how does that help with parsing things that may be forwards or
backwards in a string, or things that need to be combined from several of
those regexes? That's where grammar actions come into play.
For every "method" you match in your grammar, you get an action you can call to
do something funny or clever with that match. You also get an over-arching
@@ -87,15 +87,15 @@ called TOP by default. We'll get more into this as well.
=head2 The technical overview
Grammars are defined just like a class, but using the I<grammar> keyword in
place of class. The "methods" in grammars are declared with one of I<regex,
token or rule>. Regex methods are slow but thorough -- they backtrack, going
back in the string to try other ways to match. Token methods are much faster
because they do not backtrack, and whitespace in their definitions is ignored.
Rule methods are the same as token methods except that whitespace in your
"regex" definitions is significant and matches whitespace in the string.
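
To make the difference between the three more concrete, here is a small sketch.
The grammar name Demo and its method names are made up for this illustration
and are not part of the tutorial's example.

=begin code
# Hypothetical grammar, used only to compare regex, token and rule.
grammar Demo {
    regex with-backtrack { \w+ 'x' }   # regex: backtracks when 'x' fails
    token no-backtrack   { \w+ 'x' }   # token: \w+ keeps everything it took
    token packed         { 'a' 'b' }   # whitespace in the pattern is ignored
    rule  spaced         { 'a' 'b' }   # whitespace in the pattern is significant
}

say so Demo.parse('abcx', :rule<with-backtrack>);  # True
say so Demo.parse('abcx', :rule<no-backtrack>);    # False
say so Demo.parse('ab',   :rule<packed>);          # True
say so Demo.parse('a b',  :rule<packed>);          # False
say so Demo.parse('a b',  :rule<spaced>);          # True
=end code

In the first two calls, \w+ first swallows the trailing "x" as well. Only the
regex version gives that character back so that the literal 'x' can match.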
When a method (regex, token or rule) matches in the grammar, the matched
string is put into the Match object that will eventually be returned, keyed
with the name you gave that method.
=begin code
grammar My::Gram {
@@ -117,8 +117,8 @@ parse doesn't match the TOP regex, your returned match object will be empty
As you can see above, in TOP, the "<thingy>" token is mentioned. The <thingy>
is defined on the next line, "token thingy...". That means that
'clever_text_keyword' B<must> be the first thing in the string passed in, or
the grammar parse will fail, and we'll get an empty match. This is great for
recognizing malformed stuff that someone might give you that should be thrown
away.
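
The behaviour described above can be tried out with a short sketch. The
grammar name Keyword::First and the bodies of its tokens are assumptions made
for this illustration. The sketch only relies on the result of a failed parse
being false in boolean context.

=begin code
# Hypothetical stand-in grammar; the token bodies are assumptions.
grammar Keyword::First {
    token TOP    { <thingy> .* }
    token thingy { 'clever_text_keyword' }
}

my $good = Keyword::First.parse('clever_text_keyword and more');
my $bad  = Keyword::First.parse('oops clever_text_keyword');

say $good<thingy>.Str;   # clever_text_keyword
say so $bad;             # False, the keyword was not first
=end code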
@@ -128,27 +128,26 @@ Let's suppose we'd like to parse a URL into the component parts that make up a
RESTful request. Let's decide that we want the URLs to work like this:
=item The first part of the URI we'll call the "subject", like a part, or a
product, or a person.
=item The second part of the URI we'll call the "command", like standard CRUD
stuff (create, retrieve, update, or delete).
=item The third part of the URI will be arbitrary data. Perhaps the specific ID
we'll be working with, or a long list of data separated by "/"s.
=item When we get a URL, we'll want parts 1-3 above to be placed into a nice
data structure we can use without having to do all sorts of splitting, and that
can be easily altered or extended in the future.
So if we got a URI on the server of "/product/update/7/notify" we would want
our grammar to give us a nice $match object that has a "I<subject>" of
"product", a "I<command>" of "update" and "I<data>" of "7/notify" (for now).
The first thing we do is define the grammar class. We're going to need to
define our subject, command and data as well. I think we'll use token for them,
since we don't care about whitespace in the regex.
=begin code
grammar REST {
@@ -159,14 +158,14 @@ we don't care about whitespace in the regex.
=end code
So far this REST grammar says we want a subject that will be just I<word>
characters, a command that will be just I<word> characters, and data that will
be everything else left in the string (URI in this case).
But within the big string we get, nothing yet says what order these regex
matches will come in. We need to be able to place these matching tokens in the
larger context of the URI we'll be passing in as that string. That's what the
TOP method is for. So we add it, and place our tokens by name within it, along
with whatever else a valid incoming string should contain.
=begin code
grammar REST {
@@ -192,8 +191,8 @@ that has all 3 parameters included:
Of course, the data can be accessed directly by using $match<subject> or
$match<command> or $match<data> to return the values parsed. They each contain
match objects you can work further with, or coerce into a string
($match<command>.Str).
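
Here is a minimal, self-contained sketch of that access pattern. The grammar is
named REST::Sketch to keep it apart from the tutorial's own REST grammar, and
its token bodies are assumptions based on the description above (word
characters for subject and command, everything else for data).

=begin code
# Hypothetical grammar; token bodies follow the prose description only.
grammar REST::Sketch {
    token TOP     { '/' <subject> '/' <command> '/' <data> }
    token subject { \w+ }
    token command { \w+ }
    token data    { .* }
}

my $match = REST::Sketch.parse('/product/update/7/notify');

say $match<subject>.Str;    # product
say $match<command>.Str;    # update
say $match<data>.Str;       # 7/notify
say $match<command>.WHAT;   # (Match), still a match object until coerced
=end code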
=head2 Adding some flexibility
@@ -221,10 +220,10 @@ have:
=end code
Let's imagine, for the sake of demonstration, that we might want to allow these
same URIs to be entered in by a user from the terminal. In that case, they
might put spaces between the '/'s, since users are prone to break things. If we
wanted to accommodate this possibility, we could replace the '/'s in TOP with
another token that allowed for spaces on either side of it.
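
As a rough sketch of that idea, one could write something like the following.
The grammar name REST::Spaced and the token name slash are assumptions chosen
for this illustration, as are the bodies of the other tokens.

=begin code
# Hypothetical grammar; "slash" tolerates whitespace around each '/'.
grammar REST::Spaced {
    token TOP     { <slash> <subject> <slash> <command> <slash> <data> }
    token subject { \w+ }
    token command { \w+ }
    token data    { .* }
    token slash   { \s* '/' \s* }
}

my $m = REST::Spaced.parse('/ product / update /7 /notify');

say $m<subject>.Str;   # product
say $m<command>.Str;   # update
say $m<data>.Str;      # 7 /notify
=end code

Note that only the separators named in TOP become forgiving. Anything inside
the data token is still taken as-is, spaces included.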
=begin code
grammar REST {
@@ -349,26 +348,27 @@ Let's look at various URIs and how they behave being passed through our grammar.
=end code
So with just this part of a grammar, we're getting almost everything we need.
Our URIs get efficiently parsed and we're given a nice little data structure
for the variables we need to work with.
But look at that first line returned -- the I<data> token is returning the
entire end of the URI as just one string. We need to be able to work with that
7 there. And that 4! Well, the 4 is easy... But the 7 had the extra /notify on
the end, to signal the system to notify someone that a product was updated
(perhaps).
So let's make sure we can do stuff with our regex tokens that were matched,
such as that I<data> token that returned a "7/notify". And to do so, we'll take
advantage of another characteristic of these Grammar classes -- a thing called
actions.
=head1 Grammar Actions
We're going to digress from our example for a moment to talk about Perl's
grammar actions. We're going to do this because, in many ways, grammar actions
are separate from the grammars you define. They live in a completely different
class that you create and attach to your grammar to do stuff with the matches
you find.
You can think of grammar actions as a kind of plug-in expansion module for
grammars. A lot of the time you'll be happy using grammars just on their own.
@@ -396,12 +396,12 @@ defined as normal methods in your action class.
The only weird bit is that if you I<name your action methods with the same name
as your grammar methods> (tokens, regexes, rules), then when your grammar
methods match, your action method with the same name will get called
automatically for you, and it will be passed the match object from that
specific grammar token that matched.
In other words, when you've attached an action class, name the methods in that
class with the same names you used in your grammar class, if you want actions
to be called automatically when grammar regexes match.
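
A tiny sketch may help show the name matching at work. The Greeting grammar
and the Greeting-actions class are invented for this illustration. The only
thing connecting them is that the action method is called "name", just like
the token.

=begin code
# Hypothetical grammar and action class, for illustration only.
grammar Greeting {
    token TOP  { 'hello ' <name> }
    token name { \w+ }
}

class Greeting-actions {
    has @.seen;
    # Same name as the grammar token, so it runs whenever <name> matches.
    method name($/) { @!seen.push: $/.Str }
}

my $actions = Greeting-actions.new;
Greeting.parse('hello world', :actions($actions));

say $actions.seen;   # [world]
=end code

There is no action method for TOP here, and that is fine. Tokens without a
matching action method are simply matched as usual.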
Matching actions will get passed the Match object from the grammar token as its
argument, and this can be represented by the $/ variable. This means that all
@@ -435,8 +435,8 @@ so that we can separate out an ID if we get it, from the rest of the long URL
that might follow, such as "7/notify" in our example.
To accomplish this we'll create an action class, and in it, create a method
with the same name as the named token, rule or regex we want to process. In
this case, our token is named "data".
=begin code
class REST-actions
@@ -445,10 +445,10 @@ case, our token is named "data".
}
=end code
Now when we pass the URL string through our grammar, the "data" token match
will be passed to the action class (REST-actions) to the method "data", and
we'll split that URI string by its '/' character. That way, the first element
of the returned list will be our ID number (7 in the case of "7/notify").
But not really.
@@ -460,9 +460,9 @@ program. In order to make our action results show up, we need to call "make" on
that result, and that result can be many things, including strings, array or
hash structures.
You can imagine that the "make" we call on our action results places that
result in a special, contained area of the match object. Everything that we
"make" there can be accessed later by "made".
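
To see "make" and "made" side by side on something small, here is a sketch
that is separate from the REST example. The Numbers grammar and the
Numbers-actions class are hypothetical names used only here.

=begin code
# Hypothetical grammar and actions showing what "make" stores.
grammar Numbers {
    token TOP    { <number> }
    token number { \d+ }
}

class Numbers-actions {
    # Without "make", the Int computed here would simply be thrown away.
    method number($/) { make $/.Int }
}

my $match = Numbers.parse('42', :actions(Numbers-actions));

say $match<number>.Str;         # 42, the matched text
say $match<number>.made + 1;    # 43, the Int attached with "make"
say $match<number>.made.WHAT;   # (Int)
=end code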
So instead of our REST-actions class above, we should write
@@ -491,9 +491,10 @@ first element of the list returned from the "data" action we "made" with "make":
=end code
Here, we call "made" on our data, because we want the result of our action that
we "made" (with "make"), to get our split array. That's lovely! But, wouldn't
it be lovelier if we could "make" a friendlier data structure that contained
all of the stuff we want, rather than having to coerce types and remember
arrays?
Well, just like TOP in our grammar that over-arches and matches the entire
string, our actions have a TOP method as well. We can "make" all of our
@@ -520,14 +521,14 @@ So, our action class now might become
}
=end code
Here in our TOP method, our "subject" remains the same as the subject we
matched in our grammar. Also, our "command" returns the valid <sym> that was
matched (create, update, retrieve, or delete). We coerce each of them with .Str
as well, since we don't need the full match object.
But we do want to be certain to use the "made" method on our $<data> object,
since we want to access the split list that we "made" with "make" in our
action, rather than the $<data> match object itself.
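
Put together, the idea could be sketched roughly as follows. This is a
simplified stand-in and not the tutorial's final code: the names REST::Made
and REST-sketch-actions are made up, and the command token here is a plain
\w+ rather than one that exposes <sym>.

=begin code
# Hypothetical grammar and actions; simplified for illustration.
grammar REST::Made {
    token TOP     { '/' <subject> '/' <command> '/' <data> }
    token subject { \w+ }
    token command { \w+ }
    token data    { .* }
}

class REST-sketch-actions {
    method TOP($/) {
        make {
            subject => $<subject>.Str,
            command => $<command>.Str,
            data    => $<data>.made,     # the list made below, not the match
        }
    }
    method data($/) { make $/.split('/').list }
}

my $match = REST::Made.parse('/product/update/7/notify',
                             :actions(REST-sketch-actions));

say $match.made<subject>;    # product
say $match.made<command>;    # update
say $match.made<data>[0];    # 7
say $match.made<data>[1];    # notify
=end code

The data action runs as soon as the data token matches, which is before TOP
finishes, so its "made" value is already available inside the TOP action.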
After we "make" something in the TOP method of a grammar action, we can then
access all of our custom-made stuff by calling the "made" method on our grammar
@@ -621,13 +622,15 @@ And this is the grammar and grammar actions that got us there, to recap:
=end code
Hopefully this has helped introduce you to the concept of grammars in Perl and
how grammars and grammar action classes can tie together. For further
information, insights and oddities, check out the more advanced
L<Perl Grammar Guide|https://docs.perl6.org/language/grammars>.
Also if you could use some grammar debugging,
L<Grammar::Debugger|https://github.com/jnthn/grammar-debugger> should prove
handy. For quick debugging you get nice color-coded MATCH and FAIL output for
each of your grammar tokens, and if you like you can set breakpoints.
=end pod
