# Extract Emails from Text

This notebook demonstrates how to extract email addresses from text that contains both names and email addresses. It also shows how to use the `names` module to generate random names for the sample data.

## Random name and email generation

First, install the `names` module.

In [1]:
pip install names

Note: you may need to restart the kernel to use updated packages.


In [2]:
# Import the names module (https://pypi.org/project/names/)
import names

# Set the seed for the random number generator to ensure the same outcome each time the code is run
names.random.seed(126)

# Iterate over the numbers 0 through 19
for i in range(20):

    # Draw random first and last names
    first = names.get_first_name()
    last = names.get_last_name()

    # Build an email address from the first letter of the first name and the whole last name
    email = first[0].lower()+last.lower()+'@example.com'

    # Print names and email address
    print('\t'+first+' '+last+' ('+email+')')

	Bruce Dillon (bdillon@example.com)
	Joshua Tyson (jtyson@example.com)
	Nathan Wiederhold (nwiederhold@example.com)
	Maria Funderburk (mfunderburk@example.com)
	Ronald Traynor (rtraynor@example.com)
	Patricia Snow (psnow@example.com)
	Peggy Frey (pfrey@example.com)
	Roberto Thomas (rthomas@example.com)
	Thomas Cole (tcole@example.com)
	Brenda Orr (borr@example.com)
	Floyd Alexander (falexander@example.com)
	Gerald Holtz (gholtz@example.com)
	Craig Stewart (cstewart@example.com)
	Boyd Blanks (bblanks@example.com)
	David Streett (dstreett@example.com)
	Julia Larsen (jlarsen@example.com)
	Rosalie Darrow (rdarrow@example.com)
	Dianna Williams (dwilliams@example.com)
	Walter Reed (wreed@example.com)
	Derrick Dulany (ddulany@example.com)
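
The effect of seeding can be checked without installing anything. The following is a minimal sketch that uses only the standard `random` module; the name lists `FIRST_NAMES` and `LAST_NAMES` and the helper `make_entries` are hypothetical stand-ins for the `names` package, not part of it.

```python
import random

# Hypothetical stand-in name lists; the notebook draws from the `names` package instead
FIRST_NAMES = ['Bruce', 'Maria', 'Peggy', 'Walter']
LAST_NAMES = ['Dillon', 'Snow', 'Orr', 'Reed']

def make_entries(seed, count=3):
    """Return `count` 'First Last (email)' strings drawn with a seeded generator."""
    rng = random.Random(seed)  # independent generator, so the global state is untouched
    entries = []
    for _ in range(count):
        first = rng.choice(FIRST_NAMES)
        last = rng.choice(LAST_NAMES)
        # Same email scheme as above: first initial plus last name, lowercased
        email = f'{first[0].lower()}{last.lower()}@example.com'
        entries.append(f'{first} {last} ({email})')
    return entries

# The same seed yields the same entries on every call
print(make_entries(126) == make_entries(126))
```

Using a dedicated `random.Random` instance keeps the seeded sequence separate from any other code that relies on the global generator.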


## Email address extraction

Copy the generated output above and save it as a multiline string.

In [3]:
text = '''Bruce Dillon (bdillon@example.com)
Joshua Tyson (jtyson@example.com)
Nathan Wiederhold (nwiederhold@example.com)
Maria Funderburk (mfunderburk@example.com)
Ronald Traynor (rtraynor@example.com)
Patricia Snow (psnow@example.com)
Peggy Frey (pfrey@example.com)
Roberto Thomas (rthomas@example.com)
Thomas Cole (tcole@example.com)
Brenda Orr (borr@example.com)
Floyd Alexander (falexander@example.com)
Gerald Holtz (gholtz@example.com)
Craig Stewart (cstewart@example.com)
Boyd Blanks (bblanks@example.com)
David Streett (dstreett@example.com)
Julia Larsen (jlarsen@example.com)
Rosalie Darrow (rdarrow@example.com)
Dianna Williams (dwilliams@example.com)
Walter Reed (wreed@example.com)
Derrick Dulany (ddulany@example.com)'''

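Print `text`. Each entry appears on its own line because the string stores a newline character (`\n`) between entries; `print` renders that character as a line break rather than showing it literally.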

In [4]:
print(text)

Bruce Dillon (bdillon@example.com)
Joshua Tyson (jtyson@example.com)
Nathan Wiederhold (nwiederhold@example.com)
Maria Funderburk (mfunderburk@example.com)
Ronald Traynor (rtraynor@example.com)
Patricia Snow (psnow@example.com)
Peggy Frey (pfrey@example.com)
Roberto Thomas (rthomas@example.com)
Thomas Cole (tcole@example.com)
Brenda Orr (borr@example.com)
Floyd Alexander (falexander@example.com)
Gerald Holtz (gholtz@example.com)
Craig Stewart (cstewart@example.com)
Boyd Blanks (bblanks@example.com)
David Streett (dstreett@example.com)
Julia Larsen (jlarsen@example.com)
Rosalie Darrow (rdarrow@example.com)
Dianna Williams (dwilliams@example.com)
Walter Reed (wreed@example.com)
Derrick Dulany (ddulany@example.com)
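
To see the newline characters themselves, one option is to inspect the string's `repr`. This is a small sketch on a two-entry `sample` string (a shortened, illustrative copy of `text`):

```python
# Short two-entry sample in the same format as `text`
sample = '''Bruce Dillon (bdillon@example.com)
Joshua Tyson (jtyson@example.com)'''

# print() renders the newline as a line break; repr() shows it as the escape sequence \n
print(repr(sample))
```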


Split the string on the newline character to get a list with one entry per line.

In [5]:
lines = text.split('\n')

lines

['Bruce Dillon (bdillon@example.com)',
 'Joshua Tyson (jtyson@example.com)',
 'Nathan Wiederhold (nwiederhold@example.com)',
 'Maria Funderburk (mfunderburk@example.com)',
 'Ronald Traynor (rtraynor@example.com)',
 'Patricia Snow (psnow@example.com)',
 'Peggy Frey (pfrey@example.com)',
 'Roberto Thomas (rthomas@example.com)',
 'Thomas Cole (tcole@example.com)',
 'Brenda Orr (borr@example.com)',
 'Floyd Alexander (falexander@example.com)',
 'Gerald Holtz (gholtz@example.com)',
 'Craig Stewart (cstewart@example.com)',
 'Boyd Blanks (bblanks@example.com)',
 'David Streett (dstreett@example.com)',
 'Julia Larsen (jlarsen@example.com)',
 'Rosalie Darrow (rdarrow@example.com)',
 'Dianna Williams (dwilliams@example.com)',
 'Walter Reed (wreed@example.com)',
 'Derrick Dulany (ddulany@example.com)']
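
If the pasted text ends with a newline, `split('\n')` leaves an empty string at the end of the list. A possible alternative is `str.splitlines()`, sketched here on a short illustrative `sample` string with a trailing newline:

```python
# Illustrative sample that ends with a trailing newline
sample = 'Bruce Dillon (bdillon@example.com)\nJoshua Tyson (jtyson@example.com)\n'

# split('\n') keeps an empty string after the trailing newline
print(sample.split('\n'))

# splitlines() returns only the actual lines
print(sample.splitlines())
```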

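Iterate over the lines and split each one on the `(` character. The last element of the result holds the email address followed by `)`, so slice off the final character with `[:-1]`.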

In [6]:
for line in lines:
    email = line.split('(')
    print(email[-1][:-1])

bdillon@example.com
jtyson@example.com
nwiederhold@example.com
mfunderburk@example.com
rtraynor@example.com
psnow@example.com
pfrey@example.com
rthomas@example.com
tcole@example.com
borr@example.com
falexander@example.com
gholtz@example.com
cstewart@example.com
bblanks@example.com
dstreett@example.com
jlarsen@example.com
rdarrow@example.com
dwilliams@example.com
wreed@example.com
ddulany@example.com
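
Splitting on `(` relies on every line having exactly this layout. As an alternative sketch, a regular expression can pull the addresses straight out of the text without splitting into lines first. The pattern below is a deliberately simplified assumption that fits addresses like the ones generated here; it is not a full validator for every legal email address.

```python
import re

# Short illustrative sample in the same format as `text`
sample = '''Bruce Dillon (bdillon@example.com)
Joshua Tyson (jtyson@example.com)
Nathan Wiederhold (nwiederhold@example.com)'''

# Simplified pattern (an assumption): local part, '@', then a dotted domain
EMAIL_PATTERN = re.compile(r'[\w.+-]+@[\w-]+(?:\.[\w-]+)+')

# findall returns every non-overlapping match as a list of strings
emails = EMAIL_PATTERN.findall(sample)
print(emails)
```

This collects the addresses into a list, which is convenient when they need further processing rather than just printing.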
