## Facebook Graph API

## Regular Expression

### what is re?
From Wikipedia:
> a regular expression (abbreviated regex or regexp and sometimes called a rational expression) is a sequence of characters that define a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations.

### a first example
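
As a first taste, here is a minimal sketch using Python's built-in `re` module (covered in detail later in these notes). It pulls every run of digits out of a sentence. The sample sentence and variable names are made up for illustration.

```python
import re

# A made-up sentence, used only for illustration.
text = "Order 66 was placed on 2015-07-01"

# "[0-9]+" reads as: one or more characters in the range 0 to 9.
numbers = re.findall("[0-9]+", text)
print(numbers)
```

Each run of consecutive digits comes back as its own string, so the date is split at the hyphens.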

### components of re
+ characters (string-literals)
    + ABCabc123
+ meta-characters (operators)
    + `. ^ $ * + ? { } [ ] \ | ( )`
+ special sequences (short-cuts)
    + `\d \D \s \S \w \W`

#### `.`: any single character except newline (`\n`)
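
A small sketch of the dot in action, assuming Python's `re` module with default flags. The test string is invented for illustration.

```python
import re

# "a.c" means: "a", then any one character except a newline, then "c".
# The third candidate has a newline between "a" and "c", so it is skipped.
dot_matches = re.findall("a.c", "abc a-c a\nc")
print(dot_matches)
```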

#### anchors: `^ $`
+ `^`: matches at the beginning of the string, so the pattern that follows it must start there
+ `$`: matches at the end of the string, so the pattern that precedes it must end there
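
The two anchors can be illustrated with a short sketch, assuming Python's `re` module with default flags. The sample strings are made up.

```python
import re

# "^abc" only matches when the string starts with "abc".
starts = [bool(re.search("^abc", s)) for s in ["abcdef", "xabc"]]

# "abc$" only matches when the string ends with "abc".
ends = [bool(re.search("abc$", s)) for s in ["xabc", "abcx"]]

print(starts)
print(ends)
```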

#### repeaters: `* + ?`
specify how many times the preceding character may occur
+ `*`: zero or more occurrences
+ `+`: one or more occurrences
+ `?`: zero or one occurrence
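
To compare the three repeaters side by side, the sketch below checks which of a few made-up candidate strings each pattern matches in full. The `$` appended to each pattern is only there to force a whole-string match. The names `candidates` and `results` are illustrative.

```python
import re

candidates = ["ac", "abc", "abbbc"]
results = {}

for pattern in ["ab*c", "ab+c", "ab?c"]:
    # re.match anchors at the start; the added "$" anchors at the end.
    results[pattern] = [s for s in candidates if re.match(pattern + "$", s)]
    print("%s -> %s" % (pattern, results[pattern]))
```

`ab*c` accepts all three strings, `ab+c` rejects `"ac"` (no `b` at all), and `ab?c` rejects `"abbbc"` (more than one `b`).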

#### advanced repeaters: `{m,n}`
at least `m` and at most `n` occurrences of the preceding character

example:
+ `a{1,3}`: character "a" must occur at least once and at most three times
+ `a{1,}`: character "a" must occur at least once
+ `a{,3}`: character "a" must occur at most three times (possibly not at all)
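
A brief sketch of bounded repetition, assuming Python's `re` module. The input string is made up. Repeaters are greedy by default, so each match takes as many characters as the upper bound allows.

```python
import re

text = "a aa aaaa"

# At most three "a" per match: the run of four is split into 3 + 1.
bounded = re.findall("a{1,3}", text)

# At least two "a", no upper bound: the single "a" is skipped.
at_least_two = re.findall("a{2,}", text)

print(bounded)
print(at_least_two)
```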

#### `[]`: character class
+ specifies a set of characters, any one of which may occur at that position
+ most meta-characters lose their special meaning within the class
+ characters that have a special meaning only within the class:
    + `-`: range, e.g., `[0-9a-zA-Z]`
    + `^`: negation (when it is the first character), e.g., `[^0-9]`
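
The rules above can be tried out with a short sketch, assuming Python's `re` module. The sample text and variable names are invented for illustration.

```python
import re

text = "Room 42-B, floor 3."

# Ranges: runs of digits or ASCII letters.
letters_digits = re.findall("[0-9a-zA-Z]+", text)

# Negation: runs of anything that is not a digit.
non_digits = re.findall("[^0-9]+", text)

# Inside a class "." is a literal dot, not "any character".
dots = re.findall("[.]", text)

print(letters_digits)
print(non_digits)
print(dots)
```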

#### escaper: `\`
makes the meta-character that follows it match **as-is**, i.e., strips its special meaning (e.g., `\.` matches a literal dot)
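
A minimal sketch of the difference escaping makes, assuming Python's `re` module. The sample string is made up. The `r` prefix on the second pattern is Python's raw string notation, which keeps Python itself from interpreting the backslash.

```python
import re

price = "cost: 3.50 (approx. 3150 yen)"

# Unescaped "." matches any character, so "3150" matches as well.
loose = re.findall("3.50", price)

# Escaped "\." matches only a literal dot.
strict = re.findall(r"3\.50", price)

print(loose)
print(strict)
```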

#### `|`: or
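
`A|B` matches either the pattern on its left or the pattern on its right. A minimal sketch, assuming Python's `re` module and a made-up sentence:

```python
import re

# "cat|dog" matches either "cat" or "dog"; "bird" matches neither.
pets = re.findall("cat|dog", "a cat, a dog and a bird")
print(pets)
```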

#### grouper: `()`
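
Parentheses turn several characters into one unit, so that a repeater or `|` applies to the whole unit, and they also capture what the unit matched. The sketch below assumes Python's `re` module; the strings and the names `m1`, `m2`, `m3` are illustrative.

```python
import re

# Without a group, "+" applies only to the preceding "b".
m1 = re.match("ab+", "ababab")

# With a group, "+" repeats the whole "ab" unit.
m2 = re.match("(ab)+", "ababab")

print(m1.group())
print(m2.group())

# Each group also captures the text it matched.
m3 = re.match("(gr)(a|e)y", "grey")
print(m3.groups())
```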

#### special sequences
short-cuts for commonly used character classes
+ `\d`: any decimal digit => `[0-9]`
+ `\D`: any non-digit => `[^0-9]`
+ `\s`: any white space => `[ \t\n\r\f\v]`
+ `\S`: any non-white space => `[^ \t\n\r\f\v]`
+ `\w`: any alphanumeric character or underscore => `[a-zA-Z0-9_]`
+ `\W`: any other character => `[^a-zA-Z0-9_]`
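
The short-cuts can be tried on a small made-up string, as in the sketch below. It assumes Python's `re` module and pure-ASCII input (with non-ASCII text the classes may be wider than the bracket forms listed above, depending on the Python version and flags).

```python
import re

text = "tel: 02 1234_5678\n"

digits = re.findall(r"\d+", text)   # runs of digits
words = re.findall(r"\w+", text)    # runs of letters, digits, underscore
spaces = re.findall(r"\s", text)    # individual white-space characters

print(digits)
print(words)
print(spaces)
```

Note that the underscore counts as a `\w` character, so `1234_5678` comes back as a single word.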

## Regex in Python: Basic

### import python re module

In [9]:
import re

### compile a regex

In [77]:
p = re.compile("test+")
# return a pattern object
print p
# check methods
[me for me in dir(p) if callable(getattr(p, me)) and "__" not in me]

<_sre.SRE_Pattern object at 0x10fc8e1c0>


['findall', 'finditer', 'match', 'scanner', 'search', 'split', 'sub', 'subn']

#### the `match` method
to perform pattern matching **from the beginning** of a string

In [61]:
print p.match("123")

None


In [65]:
print p.match("123test")

None


In [66]:
print p.match("tes")

None


In [63]:
print p.match("test")

<_sre.SRE_Match object at 0x10fd04648>


In [102]:
print p.match("testtttt")

<_sre.SRE_Match object at 0x10fd09f38>


#### the `match` object

In [116]:
m = p.match("testtttt")
[me for me in dir(m) if callable(getattr(m, me)) and "__" not in me]

['end', 'expand', 'group', 'groupdict', 'groups', 'span', 'start']

In [110]:
# return the matched string
m.group()

'testtttt'

In [96]:
"the match starts at pos %s and ends at pos %s" % (m.start(), m.end())

'the match starts at pos 0 and ends at pos 8'

In [103]:
"the match starts at pos %s and ends at pos %s" % m.span()

'the match starts at pos 0 and ends at pos 8'

#### the `search` method
to perform pattern matching **anywhere** in a string

In [129]:
m = p.search("123test456")
if m:
    print "pattern matched at pos %s to pos %s" % m.span()
else:
    print "no match"

pattern matched at pos 3 to pos 7


#### the `findall` and `finditer` methods

In [135]:
p = re.compile(r"\d")
p.findall("24 hours a day; 8 in sleep and 8 for works")

['2', '4', '8', '8']

In [143]:
for m in p.finditer("24 hours a day; 8 in sleep and 8 for works"):
    print "'%s' found at %s" % (m.group(), m.span())

'2' found at (0, 1)
'4' found at (1, 2)
'8' found at (16, 17)
'8' found at (31, 32)


### implicit compile
methods like `search`, `match`, and `findall` can also be called at the module level; the pattern is then compiled implicitly

In [159]:
m = re.findall(r"\d+", "1 round for Daan Park is 2350m")
print m

['1', '2350']


In [161]:
m = re.match(r"\d+", "1 round for Daan Park is 2350m")
print m.group() # match only tries at the beginning of the string, so only "1" is returned

1


## Regex in Python: Advanced

## Raw String Notation
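
Python processes backslash escapes in ordinary string literals before the regex engine ever sees the pattern. Prefixing the literal with `r` (a raw string) leaves backslashes untouched. The sketch below is one illustration of why this matters, using `\b`, which is a backspace character in a normal Python string but a word boundary in a regex. The sample text and variable names are made up.

```python
import re

# Normal string: "\b" is one character (backspace).
# Raw string: r"\b" is two characters (backslash and "b").
print(len("\b"))
print(len(r"\b"))

text = "cat concat cat"

# Pattern contains literal backspace characters, so nothing matches.
normal = re.findall("\bcat\b", text)

# Pattern reaches the regex engine intact: "cat" as a whole word.
raw = re.findall(r"\bcat\b", text)

print(normal)
print(raw)
```

Writing patterns as raw strings by default avoids this class of surprise.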

## Reference
+ [Python official re tutorial](https://docs.python.org/2/howto/regex.html)
+ regex visualizer / tester:
    + [Debuggex](https://www.debuggex.com)
    + [regex101](https://regex101.com/#python)