Case of illogical domain suggestion #20

Closed
demark opened this issue Mar 28, 2012 · 7 comments
@demark
demark commented Mar 28, 2012

Hi there,

I found a case of an illogical domain suggestion:

With domains = ["ua.com", "ui.com"] and the input value user@uo.com, the suggestion is user@ua.com. But if you look at the keyboard image, it's obvious that I mistyped the domain ui.com: "o" sits right next to "i", while "a" is on the other side of the keyboard.

[image: keyboard layout]

I understand that the built-in algorithm can't handle every case, but I suggest thinking about how to solve cases like this one.

@tombouctou
What about hacking together a JS implementation of keyboard distance, along the lines of this Perl module:
http://cpansearch.perl.org/src/KRBURTON/String-KeyboardDistance-1.01/KeyboardDistance.pm ?
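
To make the idea concrete, here is a minimal sketch of one possible keyboard distance in JavaScript. It is not a port of the Perl module linked above: the names `ROWS`, `keyPosition`, `keyDistance` and `keyboardDistance` are hypothetical, the keys are treated as a plain grid (row stagger is ignored), and the penalty for characters that are missing or not on the letter rows is an arbitrary assumption.

```javascript
// QWERTY letter rows (assumed layout).
var ROWS = ['qwertyuiop', 'asdfghjkl', 'zxcvbnm'];

// Hypothetical helper: {row, col} of a letter, or null if it is not on the letter rows.
function keyPosition(ch) {
    for (var row = 0; row < ROWS.length; row++) {
        var col = ROWS[row].indexOf(ch);
        if (col !== -1) return { row: row, col: col };
    }
    return null;
}

// Straight-line distance between two keys on the grid.
// Characters that are not letters get a fixed, arbitrarily chosen penalty.
function keyDistance(a, b) {
    if (a === b) return 0;
    var pa = keyPosition(a), pb = keyPosition(b);
    if (!pa || !pb) return ROWS[0].length;
    var dr = pa.row - pb.row, dc = pa.col - pb.col;
    return Math.sqrt(dr * dr + dc * dc);
}

// Sum of per-position key distances; each extra character in the
// longer string costs the same fixed penalty.
function keyboardDistance(s, t) {
    var total = 0;
    var len = Math.max(s.length, t.length);
    for (var i = 0; i < len; i++) {
        if (i < s.length && i < t.length)
            total += keyDistance(s.charAt(i), t.charAt(i));
        else
            total += ROWS[0].length;
    }
    return total;
}

console.log(keyboardDistance('uo.com', 'ui.com')); // 'o' and 'i' are neighbours
console.log(keyboardDistance('uo.com', 'ua.com')); // 'o' and 'a' are far apart
```

With this metric, user@uo.com is much closer to ui.com than to ua.com, which matches the intuition in the original report.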

@derrickko
Member

Very interesting, that would actually be really cool (and hopefully more effective). Open to a pull request for it ;)

@Kreker
Kreker commented Mar 29, 2012

var email_checker = function(email) {
    //original email
    this.email = email;
    //per-instance result list (an array on the prototype would be shared by all instances)
    this.potential_servers = [];
};
email_checker.prototype = {
email : false, //user email
username : false, //email username
//keyboard list
keyboard : [
['q', 'w', 'e', 'r', 't', 'y', 'u', 'i', 'o', 'p'],
['a', 's', 'd', 'f', 'g', 'h', 'j', 'k', 'l'],
['z', 'x', 'c', 'v', 'b', 'n', 'm']
],
//post servername list
get_server_names : {"yahoo":1, "google":1, "hotmail":1, "gmail":1, "me":1, "aol":1, "mac":1,
"live":1, "comcast":1, "googlemail":1, "msn":1,
"facebook":1, "verizon":1, "sbcglobal":1, "att":1, "gmx":1, "mail":1, "ymail":1, "ya":1, "yandex":1, "vk":1, "rambler":1, "list":1, "inbox":1, "tut":1},

/*
* Return email servername and save email username
* @return string
*/
get_server_from_email : function(email) {
    var parts = email.split('@');
    this.username = parts[0];
    //fall back to an empty string when the email has no '@' part
    var domain_parts = (parts[1] || '').split('.');
    return domain_parts[0];
},

/*
* Add a potential server if the server name matches a known one after replacing one letter
* @param int key  -  index of the letter in the server name
* @param array current_row  -  keyboard row to search
* @param int keypos  -  position in that row to try
* @param array server_name  -  server name split into letters
* @return void
*/
server_checkout : function(key, current_row, keypos, server_name) {

    var new_letter = current_row[keypos]; //undefined when keypos is outside the row

    if (new_letter) {
        var old_letter = server_name[key];
        server_name[key] = new_letter; //replace letter

        var new_srv_name = server_name.join('');

        if (this.get_server_names[new_srv_name]) //server found!
            this.potential_servers.push(new_srv_name);

        server_name[key] = old_letter; //restore original letter
    }
},

/*
* Check the email's mail server
* @return bool/array  -  true if the server is known, otherwise a list of suggestions
*/
check :  function() {
    var server = this.get_server_from_email(this.email);

    if (this.get_server_names[server])
        return true;

    //search variants
    var server_spl = server.split('');

    for (var key = 0; key < server_spl.length; key++) { //checking every symbol

        //search keypos (numeric loops keep row and pos as numbers, unlike for...in)
        var keypos = null;
        for (var row = 0; row < this.keyboard.length; row++) {
            var col = this.keyboard[row].indexOf(server_spl[key]);
            if (col !== -1)
                keypos = {row: row, pos: col};
        }
        if (keypos === null) //symbol is not on the letter rows (digit, hyphen)
            continue;

        //range of adjacent rows
        var start = Math.max(keypos.row - 1, 0);
        var end = Math.min(keypos.row + 1, this.keyboard.length - 1);

        for (var cur_row = start; cur_row <= end; cur_row++) { //search in adjacent rows
            //before key
            this.server_checkout(key, this.keyboard[cur_row], keypos.pos - 1, server_spl);
            //current key in other row
            if (keypos.row != cur_row)
                this.server_checkout(key, this.keyboard[cur_row], keypos.pos, server_spl);

            //after key
            this.server_checkout(key, this.keyboard[cur_row], keypos.pos + 1, server_spl);

        }
    }

    return this.potential_servers;
}

}
var res = new email_checker('kreker20@hmail.com').check();
if (res === true)
console.log('Mailserver correct');
else
console.log('Maybe '+res.join(' or '));

@Kreker
Kreker commented Mar 29, 2012

The comment parser mangled the code formatting above.
Suggestions are welcome.

@hpshelton
Contributor

The issue here is that the implementation doesn't handle two domains that are the same distance from the input:

dist = this.stringDistance(domain, domains[i]);
if (dist < minDist) {
    minDist = dist;
    closestDomain = domains[i];
}

Add console.log(dist + ' ' + domains[i]); in that section and you will see that the distance calculated for both "ua.com" and "ui.com" is 2, but the algorithm returns only the first suggestion. Reverse the two domains in the list and the correct suggestion is returned.

I see two options. The comparison could be extended to "<=" so that an array of all equidistant domains is returned for the implementing application to choose from. Alternatively, the keyboard distance could be used to pick the closest domain whenever the first search returns multiple equidistant domains.
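
Both options can be sketched roughly as follows. This is only an illustration, not Mailcheck's code: `levenshtein` is a stand-in for Mailcheck's own `stringDistance` (which may compute different values), `closestDomains` and `suggest` are hypothetical names, and `topRowGap` is a toy tie-breaker that only knows the top QWERTY row.

```javascript
// Stand-in string distance (plain Levenshtein); an assumption for this demo.
function levenshtein(s, t) {
    var prev = [], i, j;
    for (j = 0; j <= t.length; j++) prev[j] = j;
    for (i = 1; i <= s.length; i++) {
        var cur = [i];
        for (j = 1; j <= t.length; j++) {
            var cost = s.charAt(i - 1) === t.charAt(j - 1) ? 0 : 1;
            cur[j] = Math.min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost);
        }
        prev = cur;
    }
    return prev[t.length];
}

// Option 1: collect every domain that shares the minimum distance.
function closestDomains(domain, domains, distance) {
    var minDist = Infinity, closest = [];
    for (var i = 0; i < domains.length; i++) {
        var dist = distance(domain, domains[i]);
        if (dist < minDist) {
            minDist = dist;
            closest = [domains[i]];      // strictly better: start a new list
        } else if (dist === minDist) {
            closest.push(domains[i]);    // tie: keep both
        }
    }
    return closest;
}

// Option 2: if the first pass ties, re-rank the tied domains with a second metric.
function suggest(domain, domains, distance, tieBreak) {
    var tied = closestDomains(domain, domains, distance);
    if (tied.length <= 1) return tied[0];
    return closestDomains(domain, tied, tieBreak)[0];
}

// Toy tie-breaker for the demo: gap between differing letters on the top row,
// with a fixed penalty for letters that are not on that row.
var TOP_ROW = 'qwertyuiop';
function topRowGap(s, t) {
    var total = 0;
    for (var i = 0; i < Math.min(s.length, t.length); i++) {
        if (s.charAt(i) === t.charAt(i)) continue;
        var a = TOP_ROW.indexOf(s.charAt(i)), b = TOP_ROW.indexOf(t.charAt(i));
        total += (a === -1 || b === -1) ? TOP_ROW.length : Math.abs(a - b);
    }
    return total;
}

var domains = ['ua.com', 'ui.com'];
console.log(closestDomains('uo.com', domains, levenshtein));
console.log(suggest('uo.com', domains, levenshtein, topRowGap));
```

Passing the two metrics in as parameters keeps the tie handling independent of whichever string and keyboard distances are eventually chosen.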

@hpshelton
Contributor

An initial implementation is in one of my branches based on the Perl implementation referenced above. Mailcheck will fall back to the keyboard distance if the first pass returns multiple equidistant domains.

I'll make a pull request after some touch-up and more testing.

@derrickko
Member

@hpshelton That's awesome. Really looking forward to it.