Percy Notebooks

Acknowledgments

Special thanks to Derek Duin without whom these notebooks would not have been possible.

Run with Binder

These notebooks are interactive! To launch in a live environment click:

What is Diversity Analysis?

In order to achieve the stated goal of a diverse work environment, we need to be able to produce quantifiable measures of diversity. The challenge is that indicators of diversity such as national origin, veteran status, gender, etc. are sensitive and not reported in available datasets.

However, for any population that we wish to analyze, we will always have, at a minimum, a first and last name.

In most cultures, there exist 'masculine' and 'feminine' names. However, there is no universal law that requires this. The result is that some names are strong predictors of sex, such as:

Elizabeth
Sarah
John
James

While others, such as:

Casey
Jessie
Jordan
Pat

are not strong predictors.

Names as Predictors

Based on our own experience, we are likely to agree with the names above and their respective assignments. But if our goal is to provide a quantifiable measure, intuition is not enough: we need a method that determines how strongly each name predicts sex.

Let's examine two popular approaches.

Categorical

The categorical approach assigns names to categories based on their tendency to predict a sex. For instance, with two categories we may see:

Male: John, James, Jordan
Female: Sarah, Elizabeth, Casey, Jessie

Or, with finer-grained categories:

Strongly Male: John, James
Weakly Male: Jordan
Ambiguous: Pat
Strongly Female: Elizabeth, Sarah
Weakly Female: Jessie
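
As a rough illustration of the categorical approach, the finer-grained example could be stored as a simple lookup table. This is only a sketch: the `CATEGORIES` table, the `categorize` function, and the `"Unknown"` fallback are hypothetical names chosen for this example, not part of Percy.

```python
# Hypothetical lookup table built from the example categories in this section.
CATEGORIES = {
    "Strongly Male": ["John", "James"],
    "Weakly Male": ["Jordan"],
    "Ambiguous": ["Pat"],
    "Strongly Female": ["Elizabeth", "Sarah"],
    "Weakly Female": ["Jessie"],
}

# Invert the table so each lowercase name maps to its category.
NAME_TO_CATEGORY = {
    name.lower(): category
    for category, names in CATEGORIES.items()
    for name in names
}


def categorize(first_name, default="Unknown"):
    """Return the category for a first name, or `default` if it is not listed."""
    return NAME_TO_CATEGORY.get(first_name.strip().lower(), default)


print(categorize("John"))   # a name listed under "Strongly Male"
print(categorize("Casey"))  # not listed in the example table
```

Note that a categorical label discards how strong the tendency actually is: two names in the same category are treated identically, and a name missing from the table yields no information at all.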

Probabilistic

The probabilistic approach assigns each name a probability of sex. We may see:

John: 0.05% Female
Sarah: 99.5% Female
Jordan: 26.0% Female
Jessie: 60.2% Female

Percy's diversity analysis is based on probabilistic data. The reasoning will become apparent later.
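
One thing per-name probabilities make possible is estimating the makeup of a whole group without labeling any individual: summing the probabilities gives an expected count. The sketch below illustrates that idea using the example figures above. It is an assumption about how such data might be used, not Percy's actual implementation, and `P_FEMALE` and `expected_female_share` are hypothetical names.

```python
# Example probabilities from this section, expressed as fractions
# (0.05% -> 0.0005, 99.5% -> 0.995, and so on). Illustrative values only.
P_FEMALE = {
    "john": 0.0005,
    "sarah": 0.995,
    "jordan": 0.26,
    "jessie": 0.602,
}


def expected_female_share(first_names):
    """Estimate the share of women in a group from first names alone.

    Names without a known probability are skipped rather than guessed.
    Returns (expected_count, share, names_used); share is None when no
    name in the group could be scored.
    """
    probabilities = [
        P_FEMALE[name.strip().lower()]
        for name in first_names
        if name.strip().lower() in P_FEMALE
    ]
    if not probabilities:
        return 0.0, None, 0
    expected_count = sum(probabilities)
    return expected_count, expected_count / len(probabilities), len(probabilities)


count, share, used = expected_female_share(["John", "Sarah", "Jordan", "Jessie", "Pat"])
print(count, share, used)
```

For the five names in the usage line, "Pat" has no entry and is skipped, so the estimate is based on four names and comes to about 1.86 expected women, or roughly 46% of the scored group.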
