Additional nicknames and name variants to add #62

Closed
afeibus opened this issue Dec 18, 2023 · 6 comments

Comments

@afeibus

afeibus commented Dec 18, 2023

traci = tracy (which you already have), tracie
falon = fallon, Fal, Fall, Fallie, Fally, Falcon, Lon, Lonnie (https://momlovesbest.com/fallon-name-meaning)
hillary = hilary
toni = tony, antonia, etc.
lindsay = lindsey, lindsie, lindsy
garrett = Barrett, Gare, Garrison, Gars, Gary, Jerry, Rhett, Variations: Garratt, Garret, Garrod, Jarrett, Jared, Jarratt, Jerrold (https://momlovesbest.com/garrett-name-meaning)
gareth = gary, gare
dacia = Daycia, Daisha, Dacya
marc = mark, marcus, etc.
sheri = sherry, sherryl, sheryl, sherri, cheri, cherie, etc.
dianne = diane, dian
angelika = angelica
miguel = Miguell, Miguael, Miguaell, Miguail, Miguaill, Miguayl, Miguayll = michael/mick (spanish version)
monika = monica, monique
michele = michelle
shelley = sheley, michelle, shellie, etc.
hayley = hailey, haylee, etc.
karl = carl
rosemary = rosemarie, marie, mary, rose, etc.
jalen = Jay, Jaye, Len, Lenny, Lennie, Jaylin, Alen, Al, Jaylen, Jaelen, Jaelin, Jaelyn, Jailyn, Jaylyn
rachael = rachel
kellie = kelli, kelly, kelley
kalli = kali, cali
jodi = jody
lori = lorrie, laurie, lorelei, etc.
shawn = shaun
allen = allan, alan, al
erika = erica
marcia = marcie, marsha
dona = donna
kristi = kristy, Christy, christine, christina, krista, etc.
norman = norm
chelsie = chelsey
stephine = stephanie, stephany, stephani
audree = audrey
kerri = kerry
fiona = fionna
savanna = savannah
bryanna = brianna, bri, briana, etc.
jaine = jane, jayne
leilani = lani
jesse = jessica, jess, jessie
abby = abbie
glenn = glen
carri = carrie, kari, kara
donn = don, donald
kym = kymberly, kim, kimberly, kimberli
gerri, geri = geraldine
nichole = nicky, nicki, nicholette, nicci, nicole
jamey = jaime, jamie
tami = tammie, tammy
derek = derick, derrick, derrek, rick, etc.
jenni = jennie, jenny
karin = karen
gabriela = gabriella
marni = marnie
dena = deena, dina, adina, adena
brittnie = brittany
juston = justin
lesli = leslie, lesley, les
kev = kevin
aga = athaga
carla = karla, carly
tiffanee = tiffany
staci = stacy, stacey, stacie
sara = sarah
katia = kate, katie
terri = teri, terrie, terry
ashly = ashley
jeanie = jeannie
matt = matthew, matthews
jillian = jill
laurel = laurie

(these all came from a registration list I'm working on)

@NickCrews
Collaborator

Hi! Thanks for opening this! A few things:

  1. The relationship between canonical and nickname is not symmetrical: Matt is a nickname for Matthew, but not vice versa. matt and matthew are already present in the data, so is it possible you aren't using the library correctly?
  2. I want to be conservative about which links get added, so that there aren't false positives. For instance, I'm skeptical of how common Barrett-Garrett is. 95% of your suggestions look good, but I want to leave out a few of the weirder ones.
  3. I can add these cases to the code, but only if you help me with the grunt work of formatting: putting these in the form CANONICAL,NICK0,NICK1,NICK2, etc.

Let me know what you'd like to do!
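
To make the asymmetry in point 1 concrete, here is a minimal sketch that reads rows in the CANONICAL,NICK0,NICK1,... layout and checks a link in one direction only. The sample rows and the helper names (`load_nicknames`, `is_nickname_of`) are hypothetical illustrations, not this library's API or its actual data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in the CANONICAL,NICK0,NICK1,... layout.
SAMPLE_CSV = """\
matthew,matt
sarah,sara
sara,sarah
"""

def load_nicknames(text):
    """Map each canonical name to the set of nicknames listed on its row(s)."""
    table = defaultdict(set)
    for row in csv.reader(io.StringIO(text)):
        names = [cell.strip().lower() for cell in row if cell.strip()]
        if len(names) < 2:
            continue  # skip blank rows and rows with no nicknames
        canonical, *nicks = names
        table[canonical].update(nicks)
    return dict(table)

def is_nickname_of(table, nick, canonical):
    """True only if `nick` is listed under `canonical`; the reverse is not implied."""
    return nick.lower() in table.get(canonical.lower(), set())

table = load_nicknames(SAMPLE_CSV)
print(is_nickname_of(table, "matt", "matthew"))  # True
print(is_nickname_of(table, "matthew", "matt"))  # False: no row has matt as canonical
print(is_nickname_of(table, "sarah", "sara"))    # True: this pair is listed both ways
```

A link only exists in both directions when the data lists it both ways, which is why a lookup in the wrong direction can look like a missing entry.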

@afeibus
Author

afeibus commented Dec 18, 2023

I'm unclear on the file structure and how to decide whether something is a name or a nickname (e.g., is Kari a real name or a nickname for Carrie?). I would need more info to help with this.

I'm ok if stuff gets left out; the point of my issue report was to try to close some of the holes. Some of the names I'd never seen before either, but then I looked them up and found they were common in other countries (e.g., Garrett is big in Ireland).

@NickCrews
Collaborator

The file structure is CANONICAL,NICK0,NICK1,NICK2,NICK3, etc., as you can see in the csv. Does that make sense?

Yeah, the Kari/Carrie case is ambiguous. I would lean towards Carrie being the longer one and therefore the canonical one. But for the Sara/Sarah case, I think that is symmetrical, so you should have a line sarah,sara as well as sara,sarah. Just try your best and I can go through and give my 2 cents, and we should be able to find something. The goal is just to make it better than it currently is; it doesn't need to be perfect.
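
As a rough illustration of the formatting work described here, the sketch below builds rows in the CANONICAL,NICK0,NICK1,... layout and emits both directions for a symmetrical pair. The function names are hypothetical, and deciding which name is canonical remains a manual judgment call that the code does not attempt.

```python
def to_csv_row(canonical, nicknames):
    """Format one row as CANONICAL,NICK0,NICK1,... (lowercased, de-duplicated, order kept)."""
    canonical = canonical.strip().lower()
    seen = {canonical}  # never list the canonical as its own nickname
    cleaned = []
    for nick in nicknames:
        nick = nick.strip().lower()
        if nick and nick not in seen:
            seen.add(nick)
            cleaned.append(nick)
    return ",".join([canonical, *cleaned])

def symmetric_rows(name_a, name_b):
    """For a pair judged symmetrical, emit one row in each direction."""
    return [to_csv_row(name_a, [name_b]), to_csv_row(name_b, [name_a])]

# Carrie treated as canonical because it is the longer form, per the comment above.
print(to_csv_row("Carrie", ["carri", "Kari", "kari"]))  # carrie,carri,kari
print(symmetric_rows("sarah", "sara"))                  # ['sarah,sara', 'sara,sarah']
```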

@afeibus
Author

afeibus commented Dec 27, 2023

names.csv
This is close. It may not list every possible canonical, but it should be enough that code could also search the nickname columns to find related names and treat them as canonical names too.
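
One way to read the idea of looking in the nicknames to find related names is a reverse index from nickname to canonical. The sketch below is only an assumption about what such a lookup could look like; the rows and function names are hypothetical, and it is not how the library itself works.

```python
from collections import defaultdict

# Hypothetical rows, already parsed as (canonical, [nicknames]).
ROWS = [
    ("kelly", ["kellie", "kelli", "kelley"]),
    ("michelle", ["michele", "shelley"]),
    ("shelley", ["shellie"]),
]

def build_reverse_index(rows):
    """Map each nickname to the set of canonicals whose row lists it."""
    reverse = defaultdict(set)
    for canonical, nicks in rows:
        for nick in nicks:
            reverse[nick].add(canonical)
    return reverse

def related_names(rows, name):
    """Names sharing a row with `name`, whether it appears as canonical or nickname."""
    forward = {canonical: set(nicks) for canonical, nicks in rows}
    reverse = build_reverse_index(rows)
    related = set(forward.get(name, set()))
    for canonical in reverse.get(name, set()):
        related.add(canonical)
        related |= forward[canonical]
    related.discard(name)
    return related

print(sorted(related_names(ROWS, "kellie")))   # ['kelley', 'kelli', 'kelly']
print(sorted(related_names(ROWS, "shelley")))  # ['michele', 'michelle', 'shellie']
```

Treating every name on a shared row as related deliberately ignores the canonical/nickname direction, so it trades precision for coverage and can produce the kind of false positives the maintainer wants to avoid.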

NickCrews added a commit that referenced this issue Dec 28, 2023
Started with the changes suggested in #62 (comment), but cleaned them up and made a few adjustments.
@NickCrews NickCrews mentioned this issue Dec 28, 2023
@NickCrews
Collaborator

NickCrews commented Dec 28, 2023

Thank you @afeibus! I made a few adjustments, but most of them looked great. Thank you very much, your work is very much appreciated! If you want, take a look at the change linked above and double-check that I didn't make any changes to your edits that you disagree with.

@NickCrews
Collaborator

Closing as done, but if you find any problems with the tweaks I made, please raise a new issue (and link to this one).
