## Validate emails

For the full rules on email addresses, see the [Wikipedia page](https://en.wikipedia.org/wiki/Email_address).

``local@domain``

Rules for ``local``:
- uppercase and lowercase Latin letters A to Z and a to z; digits 0 to 9;
- printable characters ``!#$%&'*+-/=?^_`{|}~``;
- dot ``.``, provided that it is not the first or last character and that it does not appear consecutively.
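
As a minimal sketch, the three rules above could be written as one pattern: a run of allowed characters, optionally followed by more runs separated by single dots. The names `PRINTABLES`, `ATOM`, `LOCAL_RE`, and `is_valid_local` are illustrative and not part of the notebook. The sketch accepts ASCII letters only, as the rules state.

```python
import re

# Characters allowed besides letters and digits, copied from the list above.
PRINTABLES = "!#$%&'*+-/=?^_`{|}~"

# One "atom": a run of allowed characters that contains no dot.
ATOM = "[A-Za-z0-9" + re.escape(PRINTABLES) + "]+"

# Atoms joined by single dots, which rules out a leading, trailing, or doubled dot.
LOCAL_RE = re.compile(ATOM + r"(?:\." + ATOM + ")*")


def is_valid_local(local):
    """Return True if `local` follows the three rules listed above."""
    return LOCAL_RE.fullmatch(local) is not None
```

Building the character class with `re.escape` avoids hand-escaping each special character.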

Rules for ``domain``:
- uppercase and lowercase Latin letters A to Z and a to z; digits 0 to 9, provided that top-level domain names are not all-numeric;
- hyphen -, provided that it is not the first or last character.
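
The domain rules can be sketched in a similar way. This version assumes that a domain is a sequence of labels separated by dots and that the hyphen rule applies to each label; neither point is spelled out in the list above. `LABEL`, `DOMAIN_RE`, and `is_valid_domain` are hypothetical names.

```python
import re

# One label: letters and digits, with hyphens allowed only in the interior.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"

# Labels joined by dots (assumption: the rules apply to each label separately).
DOMAIN_RE = re.compile(LABEL + r"(?:\." + LABEL + ")*")


def is_valid_domain(domain):
    """Return True if every label is well formed and the last label is not all digits."""
    if DOMAIN_RE.fullmatch(domain) is None:
        return False
    tld = domain.rsplit(".", 1)[-1]
    return not tld.isdigit()
```

The all-numeric check is done in plain Python after the match, which keeps the pattern itself short.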


In [8]:
import re

# read emails from emails.txt
with open('emails.txt', encoding='utf8') as fp:
    emails = fp.readlines()

# strip the trailing newline and any trailing whitespace from each line
emails = [x.rstrip() for x in emails]

print(len(emails))
print(emails)


100
['urfali/8vyb2@zpwwpfcxyp-.info.eg', 'çi̇men$f51v1@akiol-oolxu.io.kh', 'tuncer#clm5t@exzhx%ytoaz.com.tp', 'eki̇ci̇$x8zz2@evypc^ottyp.ai.fj', 'karasu}717nh@uweua&osuni.info.tg', 'albayrak<91g6b@rsfqx$ylrut.org.dm', 'koyuncu%6x0uc@syfrq?jfsdy.org.ec', 'özkaynar-9ykxt@bttwt!zgghi.gov.sv', 'kartal`bxdh9@dsvog]twjgd.com.sa', 'tülüce[jg2ks@cffti*dvrmp.info.mp', 'bozkuş]d6ukt@faawz[wgnzx.io.sl', 'göncü}9zxva@fqkea-xfzko.io.mm', 'çatuk]kp533@fdlkx}uyzmr.ai.lv', 'bağci&rddxs@mccgl`tbbut.ai.sn', 'büyükcam?gpj38@rmfah:oqzbu.org.cl', 'kayhan-cs1lp@tmdzv%otacv.io.yu', 'özdemi̇r@48vyo@nudgd|igfkn.info.ba', 'çinkit)wchlx@sjrwk|qavlk.io.vi', 'çoban<3qioq@rwtco{uokee.io.bw', 'çeti̇nkor~62tva@osxlv#pkpuj.com.al', "özek:kqnqr@bfqjo'myhmf.io.td", 'bolattürk.eqvpo@sozrh-yrwbz.gov.gt', 'kiriş[4z6je@sybfo:kludu.org.eh', 'ak`apl31@pepps:ordlf.io.ht', 'yazici erol]l0fpc@ohjsy=mbxhb.gov.tf', 'konuralp`q45rv@gcfah<ysycx.gov.lc', 'şengüleroğlu.xnzav@wmmtn*hjsrw.info.sh', 'çeli̇k}wbxtz@aywph:qevtm.ai.ml', 'boz

In [2]:
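def validate_emails(emails, N, regex, section='all'):
    """Print each of the first N+1 entries that `regex` does not fully match.

    section: 'local' tests the text before the first '@', 'domain' tests the
    second '@'-separated piece, anything else tests the whole address.
    """
    # Note: the slice [:N+1] checks N+1 entries, not N.
    if section == 'local':
        emails1 = [x.split('@')[0] for x in emails[:N+1]]
    elif section == 'domain':
        emails1 = [x.split('@')[1] for x in emails[:N+1]]
    else:
        emails1 = emails[:N+1]

    for email in emails1:
        res = regex.fullmatch(email)
        if res is None:
            print(email, 'is invalid')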
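
A possible variant, sketched here under the assumption that collecting the failures is more convenient than printing them: the hypothetical `find_invalid` returns a list and splits at the last `@`, so an address with a stray `@` keeps it in the local part.

```python
import re


def find_invalid(emails, regex, section="all"):
    """Return the entries (or their chosen part) that `regex` does not fully match."""
    invalid = []
    for email in emails:
        # Split at the last '@' so that a stray '@' stays in the local part.
        local, _, domain = email.rpartition("@")
        if section == "local":
            candidate = local
        elif section == "domain":
            candidate = domain
        else:
            candidate = email
        if regex.fullmatch(candidate) is None:
            invalid.append(candidate)
    return invalid
```

A call such as `len(find_invalid(emails, regex, 'local'))` would then give the number of rejected local parts.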
    

Let's first accept only local parts made of word characters (``\w``: letters, digits, and underscore).

In [9]:
regex = re.compile(r'\w+')

validate_emails(emails, 20, regex, 'local')

urfali/8vyb2 is invalid
çi̇men$f51v1 is invalid
tuncer#clm5t is invalid
eki̇ci̇$x8zz2 is invalid
karasu}717nh is invalid
albayrak<91g6b is invalid
koyuncu%6x0uc is invalid
özkaynar-9ykxt is invalid
kartal`bxdh9 is invalid
tülüce[jg2ks is invalid
bozkuş]d6ukt is invalid
göncü}9zxva is invalid
çatuk]kp533 is invalid
bağci&rddxs is invalid
büyükcam?gpj38 is invalid
kayhan-cs1lp is invalid
özdemi̇r is invalid
çinkit)wchlx is invalid
çoban<3qioq is invalid
çeti̇nkor~62tva is invalid
özek:kqnqr is invalid
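
Two details of `\w` are worth a quick check. For `str` patterns it is Unicode-aware, so it accepts letters such as `ö` and also the underscore, which is wider than "A to Z, a to z, 0 to 9". It does not accept combining marks, which is probably why `özdemi̇r` is flagged above: its dotted `i` appears to be written as `i` followed by U+0307. The sketch below compares `\w` with an explicit ASCII class; the variable names are illustrative.

```python
import re

unicode_word = re.compile(r"\w+")          # Unicode-aware for str patterns
ascii_alnum = re.compile(r"[A-Za-z0-9]+")  # only the characters the rules list

samples = ["özek", "a_b", "i\u0307"]  # the last one is i + combining dot above

for text in samples:
    print(
        repr(text),
        "\\w+:", unicode_word.fullmatch(text) is not None,
        "ASCII:", ascii_alnum.fullmatch(text) is not None,
    )
```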


Now include the printable characters when validating the local part.

In [4]:
import string

#printables = r'!#$%&\'\*\+-/=?^_`{|}~'
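# Word characters, then an optional run of a few special characters
# (! # $ ` % { } / ] -), then optional word characters.
exp = r'\w+[!#$`\%\{\}/\]\-\$]*\w*'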
regex = re.compile(exp)

validate_emails(emails,100,regex,'local')

çi̇men$f51v1 is invalid
eki̇ci̇$x8zz2 is invalid
albayrak<91g6b is invalid
tülüce[jg2ks is invalid
bağci&rddxs is invalid
büyükcam?gpj38 is invalid
özdemi̇r is invalid
çinkit)wchlx is invalid
çoban<3qioq is invalid
çeti̇nkor~62tva is invalid
özek:kqnqr is invalid
bolattürk.eqvpo is invalid
kiriş[4z6je is invalid
yazici erol]l0fpc is invalid
şengüleroğlu.xnzav is invalid
çeli̇k}wbxtz is invalid
bozkurt.2rcou is invalid
celtemen[wx1k9 is invalid
akyol"5m33r is invalid
kocayi̇ği̇t'co2vs is invalid
fi̇li̇z?bf71s is invalid
öztürk=b3nyr is invalid
osmanca(i2l74 is invalid
demi̇rhan,j4gbd is invalid
yilmaz;rgov1 is invalid
çakmak[jtc3z is invalid
kandemi̇r/un4my is invalid
mağatur:0rhau is invalid
avcu*95mk0 is invalid
celi̇loğlu&6727y is invalid
bereket)j8q5m is invalid
efe\d4j8y is invalid
bağ>jry92 is invalid
karamanli+1arb5 is invalid
baklaci*zg2mf is invalid
ayhan*swe7e is invalid
karalar|s4bab is invalid
barişan<83foe is invalid
öztürk)kugs6 is invalid
dağ'2vszu is invalid
önal+zj13v i

Let's validate the domain.

Rules for ``domain``:
- uppercase and lowercase Latin letters A to Z and a to z; digits 0 to 9, provided that top-level domain names are not all-numeric;
- hyphen -, provided that it is not the first or last character.

In [10]:
# Letters, digits, underscore, or dot on both sides, with optional hyphens in between.
regex = re.compile(r'[A-Za-z0-9_\.]+\-*[A-Za-z0-9_\.]+')

validate_emails(emails,100,regex,'domain')

exzhx%ytoaz.com.tp is invalid
evypc^ottyp.ai.fj is invalid
uweua&osuni.info.tg is invalid
rsfqx$ylrut.org.dm is invalid
syfrq?jfsdy.org.ec is invalid
bttwt!zgghi.gov.sv is invalid
dsvog]twjgd.com.sa is invalid
cffti*dvrmp.info.mp is invalid
faawz[wgnzx.io.sl is invalid
fdlkx}uyzmr.ai.lv is invalid
mccgl`tbbut.ai.sn is invalid
rmfah:oqzbu.org.cl is invalid
tmdzv%otacv.io.yu is invalid
sjrwk|qavlk.io.vi is invalid
rwtco{uokee.io.bw is invalid
osxlv#pkpuj.com.al is invalid
bfqjo'myhmf.io.td is invalid
sybfo:kludu.org.eh is invalid
pepps:ordlf.io.ht is invalid
ohjsy=mbxhb.gov.tf is invalid
gcfah<ysycx.gov.lc is invalid
wmmtn*hjsrw.info.sh is invalid
aywph:qevtm.ai.ml is invalid
orvzb\tycim.io.lv is invalid
azpoh]rfflx.com.lb is invalid
suvmv=uzsgu.info.gm is invalid
gxqga)yhdjt.com.vg is invalid
trqvd\znnel.io.cz is invalid
mgaih<mcsch.ai.my is invalid
dcbpm$qtuwx.ai.sj is invalid
wadhm'llwqa.gov.by is invalid
pcshn]coely.gov.sb is invalid
bxjky`xiedx.org.lr is invalid
zoukw;fgnzj.info.sv 

In [12]:
# [^(\-\.)] matches any single character except '(', ')', '-' and '.'.
regex = re.compile(r'[A-Za-z0-9_\.]+[^(\-\.)]\-*[A-Za-z0-9_\.]+')

validate_emails(emails,100,regex,'domain')

gxqga)yhdjt.com.vg is invalid
wmgle)dxjtj.gov.tg is invalid
qmvxv(jaqug.info.ba is invalid
jddfn)afzff.ai.ci is invalid
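
Only domains containing a parenthesis are still flagged, and the reason is the negated class `[^(\-\.)]`. Inside square brackets the parentheses are literal characters, not a group, so the class accepts any one character except `(`, `)`, `-`, and `.`. That lets characters such as `%` through. The short check below reuses the pattern from the cell above; the sample domains are taken from the list.

```python
import re

# Any one character except '(', ')', '-' or '.'.
not_paren_hyphen_dot = re.compile(r"[^(\-\.)]")

# Same pattern as in the cell above.
domain_re = re.compile(r"[A-Za-z0-9_\.]+[^(\-\.)]\-*[A-Za-z0-9_\.]+")

for domain in ["exzhx%ytoaz.com.tp", "gxqga)yhdjt.com.vg", "zpwwpfcxyp-.info.eg"]:
    # True means the pattern accepts the domain.
    print(domain, domain_re.fullmatch(domain) is not None)
```

The pattern also accepts `zpwwpfcxyp-.info.eg`, where a label ends with a hyphen, so it is looser than the rules listed earlier.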


In [13]:
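# The spaces inside this pattern are literal: `\. ` only matches a dot followed
# by a space, so no address with a dotted domain can match.
regex = re.compile(r"^[a-zA-Z0-9.! #$%&'*+/=? ^_`{|}~-]+@[a-zA-Z0-9-]+(?:\. [a-zA-Z0-9-]+)*$")

validate_emails(emails, 100, regex)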

urfali/8vyb2@zpwwpfcxyp-.info.eg is invalid
çi̇men$f51v1@akiol-oolxu.io.kh is invalid
tuncer#clm5t@exzhx%ytoaz.com.tp is invalid
eki̇ci̇$x8zz2@evypc^ottyp.ai.fj is invalid
karasu}717nh@uweua&osuni.info.tg is invalid
albayrak<91g6b@rsfqx$ylrut.org.dm is invalid
koyuncu%6x0uc@syfrq?jfsdy.org.ec is invalid
özkaynar-9ykxt@bttwt!zgghi.gov.sv is invalid
kartal`bxdh9@dsvog]twjgd.com.sa is invalid
tülüce[jg2ks@cffti*dvrmp.info.mp is invalid
bozkuş]d6ukt@faawz[wgnzx.io.sl is invalid
göncü}9zxva@fqkea-xfzko.io.mm is invalid
çatuk]kp533@fdlkx}uyzmr.ai.lv is invalid
bağci&rddxs@mccgl`tbbut.ai.sn is invalid
büyükcam?gpj38@rmfah:oqzbu.org.cl is invalid
kayhan-cs1lp@tmdzv%otacv.io.yu is invalid
özdemi̇r@48vyo@nudgd|igfkn.info.ba is invalid
çinkit)wchlx@sjrwk|qavlk.io.vi is invalid
çoban<3qioq@rwtco{uokee.io.bw is invalid
çeti̇nkor~62tva@osxlv#pkpuj.com.al is invalid
özek:kqnqr@bfqjo'myhmf.io.td is invalid
bolattürk.eqvpo@sozrh-yrwbz.gov.gt is invalid
kiriş[4z6je@sybfo:kludu.org.eh is invalid
ak`apl31
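
Every address is rejected by the pattern in cell `In [13]`, including ones whose domain looks well formed. A likely cause is the literal spaces in the pattern (after `!`, after `?`, and after `\.`): without `re.VERBOSE`, a space in a pattern matches a space. The sketch below shows the same pattern with the spaces removed, on the assumption that they were not intended. `ADDRESS_RE` and `is_valid_address` are hypothetical names.

```python
import re

# Same structure as the pattern in In [13], with the literal spaces removed
# (assumption: the spaces were not intended). The ^ and $ anchors are dropped
# because fullmatch already requires the whole string to match.
ADDRESS_RE = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*"
)


def is_valid_address(address):
    """Return True if the whole address matches ADDRESS_RE."""
    return ADDRESS_RE.fullmatch(address) is not None
```

This pattern is still permissive: it does not enforce the dot rules for the local part or the hyphen rule for domain labels, so `urfali/8vyb2@zpwwpfcxyp-.info.eg` would pass.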