Commit

closes #4 - pass utf-8 strings to distance fxn
cardi committed Apr 23, 2017
1 parent 8bdcd69 commit d37a6a8
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions BotDigger.py
@@ -275,8 +275,9 @@ def distanceDomain(domain, DomainDict, ccTldDict, tldDict):
         return ("not a domain", sys.maxint)
     (domain2LD, domain3LD, domain2LDs, domain3LDs) = extractLevelDomain(domain, ccTldDict, tldDict)
     for popularDomain in DomainDict:
-        if Levenshtein.distance(domain2LD, popularDomain) < minDistance:
-            minDistance = Levenshtein.distance(domain2LD, popularDomain)
+        distance = Levenshtein.distance(domain2LD.decode('utf-8'), popularDomain.decode('utf-8'))
+        if distance < minDistance:
+            minDistance = distance
             similarDomain = popularDomain
     #debug
     #sys.stdout.write("subdomain: %s, similarDomain: %s, minDistance: %d\n" % (subdomain, similarDomain, minDistance))
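
The change decodes both arguments from UTF-8 before calling Levenshtein.distance, and computes the distance once per iteration instead of twice. The commit does not say what issue #4 reported, so the following is only a sketch of one general effect of decoding first: an edit distance over raw UTF-8 bytes can differ from the distance over decoded characters. It is written for Python 3 (the patched code is Python 2), and the `levenshtein` helper is a hypothetical pure-Python stand-in for illustration, not the library function the commit calls.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (str or bytes), using a two-row DP table.

    Hypothetical stand-in for Levenshtein.distance, for illustration only.
    """
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


# Assumed example inputs: two labels that differ by one accented character.
raw_a = "café.com".encode("utf-8")  # 'é' occupies two bytes in UTF-8
raw_b = "cafe.com".encode("utf-8")

byte_distance = levenshtein(raw_a, raw_b)
char_distance = levenshtein(raw_a.decode("utf-8"), raw_b.decode("utf-8"))

print(byte_distance)  # 2: one byte substituted, one byte deleted
print(char_distance)  # 1: one character substituted
```

For pure-ASCII domain names the two distances coincide; they diverge only when a name contains multi-byte characters.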
