# A Data-Driven Approach to Job Discrimination Law
## CS481 Assignment 1: 
### Milestone 2: Building a compliant algorithmic hiring system...

In the last milestone, you became well acquainted with Sprawlmart’s latest dataset of candidates and a few algorithmic methods for evaluating individuals based on their data points. By interrogating those methods' performance chiefly through the lens of the 4/5ths rule, you determined that each technique carried a significant litigation risk of a disparate impact claim. Confident that its SLS-trained lawyers and Stanford engineers can minimize that risk, your team has decided to prototype its own algorithmic hiring system. The goal is to give the HR team a better sense of how to develop a hiring pipeline that takes advantage of advances in machine learning while prioritizing unbiased assessments and protecting the firm from expensive and damaging lawsuits.


Milestone Outline:
* Feature selection
* Model selection 
* Model training & evaluation
* Justifying your approach

### Feature Selection


* First, strategically choose the subset of features you’d like to include in your model. (in your writeup)

* Did you select any protected characteristics as features to include in your model? If so, why? What legal claims, frivolous or substantial, might you face due to their inclusion? How would you respond? (in your writeup)

* Explain the process or standard by which you selected each field to include in your feature set (e.g. you ran a regression to determine feature importance, you picked the fields that were most intuitive to you, etc.). Why might this process be important in defending against a disparate impact claim? (in your writeup)


In [None]:
#### ---- Setup the features and dataset here ------ ####
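def featureSelector(candidates_df):
    ###### ----------  BEGIN FEATURE SELECTION HERE ----- ####
    # Placeholder: passes every field through unchanged until you choose a subset.
    selected_df = candidates_df

    ##### -----------  END FEATURE SELECTION HERE ------- #####
    return selected_df

# `candidates` is assumed to be the candidate dataset loaded in the first milestone.
candidates_info = featureSelector(candidates)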

### Choosing a Model
* Detail the model specifications you've planned out.  Describe its architecture and how it arrives at decisions on each candidate in detail.  Your description should be rigorous yet simple enough for a non-technical law student to understand.  

* Build the model you laid out above! Again, you are free to make use of the algorithms and code provided to you in the first milestone. The instruction to "build" a model may sound daunting. For those of you new to this, it might just mean creating some new "model" employees with different values for their data fields and experimenting with how you calculate the distance between candidates and those model employees, for example by comparing distance metrics (L1 vs. L2). For those of you who are comfortable, go all in!
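To make the L1 vs. L2 comparison concrete, here is a minimal sketch of both distance metrics. The field names and values are hypothetical and are not taken from the Sprawlmart dataset; the point is only that the two metrics can weigh the same differences differently.

```python
import math

def l1_distance(a, b):
    """Manhattan (L1) distance: sum of absolute per-field differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def l2_distance(a, b):
    """Euclidean (L2) distance: square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical "model" employee and candidate, each encoded as
# (years_experience, skills_score, interview_score).
model_employee = (5.0, 8.0, 7.0)
candidate = (2.0, 8.0, 3.0)

print(l1_distance(model_employee, candidate))  # 7.0
print(l2_distance(model_employee, candidate))  # 5.0
```

L2 penalizes one large gap more heavily than several small ones, so the choice of metric can change which candidates count as "close" to a model employee. Fields on very different scales may also need rescaling before either metric is meaningful.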

In [None]:
#### ---- Setup your model and dataset here - no need to use this template ------ ####
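class compliantAlgorithmicModel():
    def __init__(self):
        pass

    def train(self, features, labels):
        # Fit your model to the training features and labels here.
        raise NotImplementedError

    def predict(self, features):
        # Return a select / do-not-select decision for each candidate here.
        raise NotImplementedError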

* OK, now provide a breakdown of the candidates your algorithm selected to move on. What was the breakdown of selected candidates by protected characteristic? Does your algorithm pass the sniff test? If not, you'll have to go back and debug your model. (Some common things to look for: does it seem to be making blanket classifications, e.g. everyone is unqualified, everyone is excellent, or everyone with cultural fit is selected?) Do the data points of the selected candidates match what you engineered your algorithm to optimize for? Note any outliers in your pool of selected candidates.


* Evaluate your prototype’s performance with respect to the 4/5ths rule. You can use the code below to perform your analysis. Is it compliant across all protected characteristics?

* Given your algorithm’s architecture and the features it uses, does employing it create a risk of the company being liable under a disparate treatment claim? Explain your answer.

* If your algorithm does not comply with the 4/5ths rule, explain how you would defend its use if you encountered a disparate impact claim. Can you make a case for business necessity?
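As a reminder of the arithmetic behind the 4/5ths rule, the sketch below computes each group's selection rate and compares it with the highest group's rate. The helper names, the group labels, and the toy data are illustrative assumptions, not part of the provided milestone code.

```python
from collections import Counter

def selection_rates(groups, selected):
    """Fraction of each group that was selected."""
    totals = Counter(groups)
    chosen = Counter(g for g, s in zip(groups, selected) if s)
    return {g: chosen[g] / totals[g] for g in totals}

def four_fifths_check(groups, selected, threshold=0.8):
    """Compare every group's selection rate with the highest group's rate."""
    rates = selection_rates(groups, selected)
    best = max(rates.values())
    if best == 0:
        raise ValueError("No candidates were selected; impact ratios are undefined.")
    return {
        g: {
            "rate": rate,
            "impact_ratio": rate / best,
            "passes": rate / best >= threshold,
        }
        for g, rate in rates.items()
    }

# Toy data: 10 candidates per group; 6 selected from group A, 3 from group B.
groups = ["A"] * 10 + ["B"] * 10
selected = [True] * 6 + [False] * 4 + [True] * 3 + [False] * 7

result = four_fifths_check(groups, selected)
for group, stats in result.items():
    print(group, stats)
```

In this toy example group B is selected at half the rate of group A, well under the 0.8 threshold. Run the same check once per protected characteristic, since a model can pass on one and fail on another.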

In [None]:
### --------------- RUN YOUR MODEL ON THE CANDIDATES ------------- ###
    # Start Code
    
    
    
    # End Code

In [None]:
### --------------- Evaluate your model's selections ------------- ###
    # Start Code
    
    
    
    # End Code

### Analyzing Performance:
"From the module casebook" - The folks from HR Engineering actually turned out to have an additional dataset of about 250 past/current employees with the same datapoints you encountered previously, how fortunate! Noting the importance of minimizing potential bias, the dataset was constructed to be diverse and well-balanced. Moreover, they also have a field marking the quality of each candidate’s performance during their most recent year of employment for Sprawlmart.  Employees were labeled as either satisfactory or unsatisfactory.  

In [None]:
### --------------- Loading the "Held-Out" Dataset ------------- ###
employeeData = loadModel("../data/employeeData.csv")

* Run your algorithm on this dataset with the satisfactory/unsatisfactory data field left out. You should now have a bunch of candidates “selected” from this dataset.  

* Now analyze your algorithm’s compliance with the 4/5ths rule again. You’ve already addressed the disparate treatment claim above, so you should focus on disparate impact for the new dataset. If this were the dataset of candidates you were evaluating, would there now be a possible disparate impact claim?


In [None]:
### --------------- Run your model to produce predictions ------------- ###
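# Instantiate the class defined above under a distinct name so the class is not shadowed.
compliantHiringAlg = compliantAlgorithmicModel()
# Remember to leave the satisfactory/unsatisfactory field out of the features you pass in.
predictions = compliantHiringAlg.predict(employeeData)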

In [None]:
### --------------- Evaluate your model's 4/5th status ------------- ###
    # Start Code
    
    
    
    # End Code

* Now, how accurate is your algorithm with respect to selecting candidates that actually performed satisfactorily?  How many false positives did you have - employees your algorithm selected that actually exhibited unsatisfactory performance? How many false negatives did you have - employees your algorithm did not select but actually had satisfactory performance?

* Using those calculations and the formulas below, calculate the precision and recall rates of your algorithm on this “held-out” dataset. Intuitively, what do your precision and recall rates tell you about the performance of your algorithm on this dataset?
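The sketch below works through these quantities on a tiny made-up example, treating "selected" as the prediction and "satisfactory" as the true label. The data and helper names are hypothetical; your own functions will need to read from the held-out dataset instead.

```python
def confusion_counts(selected, satisfactory):
    """Count true/false positives and negatives from paired boolean lists."""
    tp = fp = fn = tn = 0
    for picked, good in zip(selected, satisfactory):
        if picked and good:
            tp += 1      # selected and actually satisfactory
        elif picked and not good:
            fp += 1      # selected but unsatisfactory
        elif not picked and good:
            fn += 1      # passed over but satisfactory
        else:
            tn += 1      # passed over and unsatisfactory
    return tp, fp, fn, tn

def safe_ratio(numerator, denominator):
    """Avoid dividing by zero when a category is empty."""
    return numerator / denominator if denominator else 0.0

# Eight hypothetical employees.
selected =     [True, True,  True, False, False, True, False, False]
satisfactory = [True, False, True, True,  False, True, False, True]

tp, fp, fn, tn = confusion_counts(selected, satisfactory)
accuracy = safe_ratio(tp + tn, tp + fp + fn + tn)
precision = safe_ratio(tp, tp + fp)   # TP / (TP + FP)
recall = safe_ratio(tp, tp + fn)      # TP / (TP + FN)

print(tp, fp, fn, tn)                 # 3 1 2 2
print(accuracy, precision, recall)    # 0.625 0.75 0.6
```

Here three of the four selected employees were satisfactory (precision 0.75), but only three of the five satisfactory employees were selected (recall 0.6).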


In [None]:
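def calcAccuracy():
    # Start Code
    # (TP + TN) / (TP + TN + FP + FN)
    pass
    # End Code

def countNumFalsePositives():
    # Start Code
    # selected by the algorithm, but performance was unsatisfactory
    pass
    # End Code

def countNumFalseNegatives():
    # Start Code
    # not selected by the algorithm, but performance was satisfactory
    pass
    # End Code

def calcRecall():
    # Start Code
    # TP / (TP + FN)
    pass
    # End Code

def calcPrecision():
    # Start Code
    # TP / (TP + FP)
    pass
    # End Code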

### Iteration

* Implement the improvements you proposed. Analyze your performance again in terms of both accuracy on our “held-out” dataset and 4/5ths compliance.

In [None]:
#### - Implement Model Improvements for Disparate Impact - ####


* Recall from the casebook - "Using what you have learned in class, and the readings assigned, suggest three to five substantial adjustments you can make to improve your algorithm’s performance."

* Implement two of them and test your performance.  


In [None]:
#### - Implement Model Improvements for Performance Metrics - ####


* Did those changes affect your compliance with the 4/5ths rule at all?

* Finally, using the formulas below, report your model’s adherence to statistical parity and differential validity on this held-out dataset. What do these results tell you?
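The sketch below assumes two common operationalizations, which may differ from the definitions used in class: statistical parity as the difference in selection rates between two groups, and differential validity as the difference in how accurately the model predicts performance for each group. The data, group labels, and function names are hypothetical.

```python
def group_rate(flags, groups, group):
    """Share of True flags among members of one group."""
    members = [flag for flag, g in zip(flags, groups) if g == group]
    return sum(members) / len(members)

def statistical_parity_difference(selected, groups, group_a, group_b):
    """Selection rate of group_a minus selection rate of group_b (0 means parity)."""
    return group_rate(selected, groups, group_a) - group_rate(selected, groups, group_b)

def differential_validity(selected, satisfactory, groups, group_a, group_b):
    """Assumed definition: accuracy for group_a minus accuracy for group_b."""
    correct = [picked == good for picked, good in zip(selected, satisfactory)]
    return group_rate(correct, groups, group_a) - group_rate(correct, groups, group_b)

# Toy held-out data: four employees in each of two groups.
groups       = ["A", "A", "A", "A", "B", "B", "B", "B"]
selected     = [True, True, True, False, True, False, False, False]
satisfactory = [True, True, False, False, True, True, True, False]

spd = statistical_parity_difference(selected, groups, "A", "B")
dv = differential_validity(selected, satisfactory, groups, "A", "B")
print(spd)  # 0.5  (selection rates 0.75 vs. 0.25)
print(dv)   # 0.25 (accuracy 0.75 vs. 0.5)
```

Values near zero on both measures suggest the model selects the two groups at similar rates and is about equally accurate for each; large gaps on either one are worth investigating.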


In [None]:
#### - Evaluate for 4/5ths compliance - ####


#### - Evaluate for fairness metrics - ####

def calcStatisticalParity():
    # Start Code
    pass
    # End Code

def calcDifferentialValidity():
    # Start Code
    pass
    # End Code

In [1]:
Additional Legal Issues or Places for Integration
On Disparate Impact:
1. What constitutes a protected group? Initially it may just be black or white, but what about the
black female selection rate versus the white male selection rate? E.g. our black selection rate is
good, our female selection rate is solid, but we reject practically every black female.
2. At what stage do we apply the 4/5ths rule: screening, interviewing, or hiring?
3. Besides the selection rate, what other metrics can be used as additional evidence during litigation
to prove or disprove disparate impact?
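The intersectional concern in point 1 can be illustrated with a small sketch. The counts below are invented purely to show how each single characteristic can clear the 4/5ths threshold while one intersectional subgroup is never selected; the helper names are hypothetical.

```python
from collections import defaultdict

def rates_by(records, key):
    """Selection rate for each group produced by the key function."""
    totals, chosen = defaultdict(int), defaultdict(int)
    for rec in records:
        k = key(rec)
        totals[k] += 1
        chosen[k] += rec["selected"]
    return {k: chosen[k] / totals[k] for k in totals}

def impact_ratios(rates):
    """Each group's rate divided by the highest group's rate."""
    best = max(rates.values())
    return {k: rate / best for k, rate in rates.items()}

def make(race, sex, n, n_selected):
    """Build n records for one subgroup, the first n_selected of them selected."""
    return [{"race": race, "sex": sex, "selected": i < n_selected} for i in range(n)]

# Invented counts: (subgroup size, number selected).
records = (
    make("black", "F", 4, 0) + make("black", "M", 16, 10)
    + make("white", "F", 16, 10) + make("white", "M", 4, 2)
)

by_race = impact_ratios(rates_by(records, lambda r: r["race"]))
by_sex = impact_ratios(rates_by(records, lambda r: r["sex"]))
by_both = impact_ratios(rates_by(records, lambda r: (r["race"], r["sex"])))

print(by_race)   # both ratios at or above 0.8
print(by_sex)    # both ratios at or above 0.8
print(by_both)   # ("black", "F") has ratio 0.0
```

Checking only race and only sex would show nothing alarming here, which is why the choice of comparison groups matters.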


On Disparate Treatment:
1. How could an algorithm produce a "legitimate non-discriminatory reason" for which the similarly
situated individual was rejected?
2. This begins to implicate plans for affirmative action. Suppose we see that the algorithm is potentially
acting in a discriminatory manner. The team has two options:

(a) Intervene with human decision-making after the algorithm produces scores/selections.
(b) Intervene by fundamentally changing the model's architecture, e.g. by implementing a
fairness-correcting measure.
    - This opens a wide span of options, since there are so many ways in which we can tweak the architecture.
        - This raises the question of whether there ought to be a standardized procedure for tweaking the architecture.
        - Would that be valuable to companies seeking to avoid litigation risk, as a kind of safe harbor?
    - Key question: what is the existing case law on this? Here is where we introduce the Ricci vs. Weber
    distinction.

On the ADA: 
- You have voice
- What would an alternative route of interviewing look like for someone who requests it?
