# Learning from:

*Getting Started with Pyparsing* by Paul McGuire. Publisher: O'Reilly Media <br>
http://shop.oreilly.com/product/9780596514235.do


In [45]:
from pyparsing import *

### "Hello, World!" on Steroids!
page 9. 

The task is to write a parser for these strings:

Hello, World! <br>
Hi, Mom! <br>
Good morning, Miss Crabtree! <br>
Yo, Adrian! <br>
Whattup, G? <br>
How's it goin', Dude? <br>
Hey, Jude! <br>
Goodbye, Mr. Chips! <br>




Supplying the input values as a list of strings:

In [46]:
tests=['Hello, World!', 'Hi, Mom!', 
      'Good morning, Miss Crabtree!',
      'Yo, Adrian!',
      'Whattup, G?',
      'Hey, Jude!',
      'Goodbye, Mr. Chips!',
      'How\'s it going\', Dude?']

Printing the input values to check:

In [47]:
print(tests)

['Hello, World!', 'Hi, Mom!', 'Good morning, Miss Crabtree!', 'Yo, Adrian!', 'Whattup, G?', 'Hey, Jude!', 'Goodbye, Mr. Chips!', "How's it going', Dude?"]


"The first step is to identify the pattern that they all follow" <br>

Writing this pattern as a BNF:

greeting ::= salutation comma greetee endpunc <br>
salutation ::= word+ <br>
comma ::= , <br>
greetee ::= word+ <br>
word ::= a collection of one or more characters, which are any alpha or ' or . <br>
endpunc ::= ! | ? <br>




In [48]:
word = Word(alphas+"'.")
salutation = OneOrMore(word)
comma = Literal(",")
greetee = OneOrMore(word)
endpunc = oneOf("! ?")
greeting = salutation + comma + greetee + endpunc
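
To see how the `word` definition behaves on its own, here is a small sketch. The sample strings and the names `words`, `title_tokens`, and `prefix_tokens` are illustrative and do not come from the book. It suggests two things: whitespace between tokens is skipped by default, and matching simply stops at the first character outside the allowed set.

```python
from pyparsing import OneOrMore, Word, alphas

# Same definition as in the cell above: letters plus apostrophe and period.
word = Word(alphas + "'.")
words = OneOrMore(word)

# Whitespace between tokens is skipped by default, and the period
# is part of the word because it is in the allowed character set.
title_tokens = words.parseString("Mr. Chips").asList()

# Matching stops at the comma, which is not in the character set.
# By default parseString does not require consuming the whole input.
prefix_tokens = words.parseString("Good morning, Miss Crabtree!").asList()

print(title_tokens)
print(prefix_tokens)
```

This is why the full `greeting` expression needs an explicit `comma` element between the two runs of words.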



The `greeting` variable holds the grammar for the parse: it is a parser object that can run the parse and perform other operations. Here it parses the third element of the list, `tests[2]` (Python lists are indexed from 0).

In [55]:
greeting.parseString(tests[2])

(['Good', 'morning', ',', 'Miss', 'Crabtree', '!'], {})
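
The value displayed above is a `ParseResults` object, which can be used much like a list. The sketch below rebuilds the same grammar so that it runs on its own. The variable names `tokens`, `first`, `count`, and `failed`, and the comma-less test string, are illustrative assumptions. It also shows that input which does not fit the grammar raises `ParseException`.

```python
from pyparsing import Literal, OneOrMore, ParseException, Word, alphas, oneOf

# Same grammar as in the cells above, written out in full.
word = Word(alphas + "'.")
greeting = OneOrMore(word) + Literal(",") + OneOrMore(word) + oneOf("! ?")

results = greeting.parseString("Good morning, Miss Crabtree!")

tokens = results.asList()  # plain Python list of the matched tokens
first = results[0]         # list-style indexing works on the results
count = len(results)       # and so does len()

# A string without the comma does not match the grammar.
try:
    greeting.parseString("Good morning Miss Crabtree")
    failed = False
except ParseException:
    failed = True

print(tokens, first, count, failed)
```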

Running the parse on every item in the list:

In [54]:
for t in tests:
    view = greeting.parseString(t)
    print(view)

['Hello', ',', 'World', '!']
['Hi', ',', 'Mom', '!']
['Good', 'morning', ',', 'Miss', 'Crabtree', '!']
['Yo', ',', 'Adrian', '!']
['Whattup', ',', 'G', '?']
['Hey', ',', 'Jude', '!']
['Goodbye', ',', 'Mr.', 'Chips', '!']
["How's", 'it', "going'", ',', 'Dude', '?']


"to identify the tokens that compose the initial part of the greeting--the salutation--we need to iterate over the results until we reach the comma token:"

In [49]:
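for t in tests:
    results = greeting.parseString(t)
    # Note: this rebinds the name `salutation` to a plain list;
    # `greeting` still refers to the original parser element.
    salutation = []
    for token in results:
        # Stop at the comma that separates salutation from greetee.
        if token == ",":
            break
        salutation.append(token)
    print(salutation)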

['Hello']
['Hi']
['Good', 'morning']
['Yo']
['Whattup']
['Hey']
['Goodbye']
["How's", 'it', "going'"]
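
As an alternative to the explicit loop, the same split can be sketched with `list.index` and slicing. The helper name `split_salutation` is hypothetical, and the function assumes a plain list of tokens (for example the output of `results.asList()`) that contains a comma token.

```python
def split_salutation(tokens):
    """Return the tokens that come before the first comma token."""
    comma_at = tokens.index(",")  # raises ValueError if there is no comma
    return tokens[:comma_at]


# Token lists copied from the parse output shown earlier.
sample = ['Good', 'morning', ',', 'Miss', 'Crabtree', '!']
print(split_salutation(sample))
```

Both versions still search for the comma after parsing has finished, so the structure found by the parser is not yet reflected in the results themselves.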
