# Task 1: Regular Expressions

*by Lukas Dötlinger*

## Patterns

Regular expression for **e-mail addresses**: `([\w.+-]+@[A-Za-z0-9-]+\.[A-Za-z0-9-.]+)`

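`[\w.+-]+` matches one or more characters that are alphanumeric, an underscore, a dot, a plus, or a minus. A literal `@` follows, and then the domain: `[A-Za-z0-9-]+\.[A-Za-z0-9-.]+` requires at least two domain labels joined by a dot, where the labels may contain letters, digits, and hyphens. Underscores are not allowed in the domain.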
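To make the individual parts easier to see, the same e-mail pattern can be written in verbose mode with one commented line per part. This is only a sketch: the names `EMAIL_RE` and `is_email` are chosen here for illustration, and `fullmatch` is used so that a single candidate string is checked as a whole rather than searched within a longer text.

```python
import re

# Same e-mail pattern as above, split into commented parts via re.VERBOSE.
EMAIL_RE = re.compile(
    r"""
    [\w.+-]+          # local part: word characters, '.', '+', '-'
    @                 # literal separator
    [A-Za-z0-9-]+     # first domain label
    \.                # dot between labels
    [A-Za-z0-9-.]+    # remaining labels, may contain further dots
    """,
    re.VERBOSE,
)


def is_email(candidate: str) -> bool:
    """Return True if the whole string matches the pattern (illustrative helper)."""
    return EMAIL_RE.fullmatch(candidate) is not None


print(is_email("valid_test@test.ac.at"))
print(is_email("invalid@test_xy.at"))
```

The second call is rejected because the underscore in `test_xy` is not part of the domain character class.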

Regular expression for **phone numbers**: `(((\+|00)[0-9]{2}|0)-[0-9]{3,4}-[0-9]+)`

The expression matches phone numbers made up of three parts separated by `-`. The first part is the country code: either a single `0`, indicating the country the call originates from, or `(\+|00)[0-9]{2}`, a two-digit country code preceded by `+` or `00`. The second part identifies a local region or provider network and consists of three or four digits. The last part contains one or more digits.
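The three parts can also be extracted individually. The following sketch assumes named groups (`prefix`, `area`, `number`) and a helper `split_phone`, neither of which is part of the original pattern. The inner alternation is made non-capturing so that only the three named parts are reported.

```python
import re

# Phone pattern from above with named groups for the three parts.
# The group names are illustrative assumptions.
PHONE_RE = re.compile(
    r"(?P<prefix>(?:\+|00)[0-9]{2}|0)"  # country code or single 0
    r"-(?P<area>[0-9]{3,4})"            # region or provider network
    r"-(?P<number>[0-9]+)"              # subscriber number
)


def split_phone(candidate: str):
    """Return the three parts as a dict, or None if the string does not match."""
    match = PHONE_RE.fullmatch(candidate)
    return match.groupdict() if match else None


print(split_phone("+43-5352-94221"))
print(split_phone("+4-676-6"))
```

The second call returns `None` because a country code introduced by `+` must have exactly two digits.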

## Example

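```python
import re
from pprint import pprint

# Candidate e-mail addresses; the last two should not match.
emails = """
  test@test.com
  valid_test@test.ac.at
  valid-test2@test-xyz.at
  invalid:@test
  invalid@test_xy.at
"""

print("Found addresses:")
# Raw strings keep backslashes such as \w and \. intact for the regex engine.
pprint(re.findall(r"([\w.+-]+@[A-Za-z0-9-]+\.[A-Za-z0-9-.]+)", emails))

# Candidate phone numbers; the last two should not match.
phone_numbers = """
  +43-676-6767677899890
  0043-664-56743
  +43-5352-94221
  0-512-987654321
  +4-676-6
  0-51223-987654321
"""

print("\nFound phone numbers:")
# With several capture groups, findall returns one tuple per match.
pprint(re.findall(r"(((\+|00)[0-9]{2}|0)-[0-9]{3,4}-[0-9]+)", phone_numbers))
```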

```
Found addresses:
['test@test.com', 'valid_test@test.ac.at', 'valid-test2@test-xyz.at']

Found phone numbers:
[('+43-676-6767677899890', '+43', '+'),
 ('0043-664-56743', '0043', '00'),
 ('+43-5352-94221', '+43', '+'),
 ('0-512-987654321', '0', '')]
```
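
The phone number results are tuples because the pattern contains three capture groups, and `re.findall` returns one entry per group. If only the full numbers are needed, one option is to make the groups non-capturing with `(?:...)`. The sketch below uses the same sample input; the variable names `PHONE_RE` and `found` are chosen here for illustration.

```python
import re

phone_numbers = """
  +43-676-6767677899890
  0043-664-56743
  +43-5352-94221
  0-512-987654321
  +4-676-6
  0-51223-987654321
"""

# Same phone pattern, but with non-capturing groups, so findall
# returns the full matches as plain strings instead of tuples.
PHONE_RE = re.compile(r"(?:(?:\+|00)[0-9]{2}|0)-[0-9]{3,4}-[0-9]+")

found = PHONE_RE.findall(phone_numbers)
print(found)
```

This yields the same four numbers as before, each as a single string.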
