FairLens — Transparency in Every Hiring Decision
Transform opaque AI hiring models into auditable, fair systems. FairLens detects gender bias, income disparities, and hidden proxy variables in real time using Demographic Parity metrics and the EEOC 80% Rule.
Tech Stack: React 18 + Vite | FastAPI + Pandas | Gemini 2.5 Flash | Tailwind CSS | Google Cloud Run
Modern AI hiring systems are black boxes. HR teams upload datasets to automated decision engines and receive a score—but they have no idea:
- Is the algorithm unfairly rejecting women candidates?
- Does it systematically disadvantage low-income applicants?
- Are "neutral" columns like Zip Code secretly encoding race or wealth?
The Impact: In 2023 alone, Amazon, Google, and Apple faced regulatory scrutiny over algorithmic bias in hiring. Under the EEOC's "Disparate Impact" standard (the 80% Rule), a protected group's selection rate should be at least 80% of the highest group's rate. Yet 87% of organizations lack the tools to measure this.
The Gap: Existing bias audit tools require PhD-level statistics knowledge. They're too expensive, too slow, and they don't explain why bias exists or how to fix it.
FairLens automates fairness audits in three steps:
- Upload your CSV (any format, any column names).
- Map columns to roles (protected attribute, hiring decision, income level).
- Analyze — Get a fairness score (0-100) + Gemini-powered explanations + actionable fixes.
In 30 seconds, you have a complete picture of bias in your system.
FairLens directly addresses systemic discrimination in AI hiring. By making bias visible and measurable, we empower organizations to make fairer decisions. Our Demographic Parity metric ensures that protected groups are hired at equal rates, reducing the hidden inequalities that perpetuate economic exclusion.
How we meet SDG 10:
- Detect gender, race, and socioeconomic bias in automated decisions.
- Provide proxy variable detection to surface hidden discrimination (e.g., Zip Code as a proxy for race).
- Enable users to simulate bias reduction—showing the real impact of threshold adjustments on fairness.
Fair hiring processes unlock economic mobility for marginalized groups. When AI systems stop filtering out qualified candidates based on protected attributes, we see:
- Broader talent pools for employers.
- Better job matches for workers.
- Reduced turnover among underrepresented employees who would otherwise feel demotivated and leave.
How we meet SDG 8:
- Empower HR teams to identify and fix systemic barriers.
- Reduce hiring discrimination that blocks economic advancement.
- Support compliance with labor laws (EEOC, UK Equality Act 2010).
The Problem: CSV files use wildly different naming conventions. One dataset has "Gender," another has "Sex" or "protected_attr." Building bias detection tools means you have to normalize thousands of variations.
Our Solution: We built a zero-configuration Column Mapper that:
- Auto-detects common column names (Gender, Race, Hired, Decision, Income, etc.).
- Lets users manually map any CSV column to its role: Protected Attribute, Target Outcome, Socioeconomic Status.
- Handles 8+ gender encoding variants: M/F, male/female, 0/1, true/false.
- Normalizes target values: yes/no, hired/rejected, pass/fail, 1/0.
Example Flow:
User uploads hiring_2024.csv with columns: [Employee_ID, Gender_Type, Approved_Decision, Annual_Pay]
FairLens auto-detects: Gender_Type → Protected Attribute, Approved_Decision → Target, Annual_Pay → Income Level.
User clicks "Analyze" without touching a single column.
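To make the normalization step concrete, here is a minimal Python sketch of how the value mapping could work. It reuses the column names from the example flow above; the mapping tables and function name are illustrative assumptions, not the exact bias_calculator.py code.

```python
# Illustrative value-normalization sketch (not the exact FairLens implementation).
import pandas as pd

# Assumed encodings; the 0/1 assignment is arbitrary since metrics use absolute gaps.
GENDER_MAP = {"m": 1, "male": 1, "1": 1, "true": 1,
              "f": 0, "female": 0, "0": 0, "false": 0}
TARGET_MAP = {"yes": 1, "hired": 1, "pass": 1, "1": 1,
              "no": 0, "rejected": 0, "fail": 0, "0": 0}

def normalize(df: pd.DataFrame, protected_col: str, target_col: str) -> pd.DataFrame:
    """Map raw gender/decision encodings onto 0/1 so the metrics can be computed."""
    out = df.copy()
    out[protected_col] = (
        out[protected_col].astype(str).str.strip().str.lower().map(GENDER_MAP)
    )
    out[target_col] = (
        out[target_col].astype(str).str.strip().str.lower().map(TARGET_MAP)
    )
    # Drop rows whose protected attribute or outcome could not be normalized.
    return out.dropna(subset=[protected_col, target_col])

df = normalize(pd.read_csv("hiring_2024.csv"),
               protected_col="Gender_Type", target_col="Approved_Decision")
```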
Why This Wins: Most bias tools require data scientists to preprocess CSVs. FairLens lets HR analysts work with their raw data immediately.
The Challenge: Bias reduction isn't free. Lower your hiring bar to be more inclusive, and you might hire less-qualified candidates. Raise it to maintain quality, and you exclude protected groups.
Our Solution: A real-time What-If Simulator that:
- Lets users drag a slider to adjust the decision threshold (0.0 = "accept everyone" to 1.0 = "accept no one").
- Shows live updates to Demographic Parity Difference at every threshold.
- Displays per-group acceptance rates, the overall acceptance rate, and the bias gap.
- Includes a mini bar chart showing how acceptance rates and bias change across thresholds.
- Provides fairness verdicts: "At threshold 0.65, bias is 8%—within acceptable range."
Example: A user sees that at threshold 0.5, female candidates are hired at a rate 25 percentage points lower than males (bias = 25%, HIGH RISK). They drag to 0.4, and the bias drops to 12% (MODERATE). They accept 15% fewer candidates overall but substantially reduce the gender bias.
Why This Wins: Transparency + agency. Users understand the fairness-accuracy tradeoff instead of blindly accepting an algorithm's decisions.
The Hidden Problem: A hiring model may never explicitly see "Race" or "Socioeconomic Status," but if it uses Zip Code, Neighborhood, School District, or Years of Education, it's indirectly discriminating—these are proxies for protected attributes.
Our Solution: Correlation-Based Proxy Detection that:
- Computes Pearson correlation between every numeric column and the protected attribute.
- Flags columns with |r| > 0.3 as "red flag" proxy variables.
- Displays correlation strength for each proxy (e.g., "Zip_Code: r=0.69").
- Explains why they're harmful: "Zip Code correlates 0.69 with gender—likely encodes wealth or race."
Real Example from Our Dataset:
- Gender and Zip_Code: |r| = 0.69 ✗ HIGH RISK
- Gender and Age: |r| = 0.12 ✓ Safe
Why This Wins: Regulators and researchers often miss proxies. We catch them automatically, preventing the "we didn't know" defense.
The Translation Problem: Bias metrics are technical. HR managers don't understand "Demographic Parity Difference = 25%." They need plain English.
Our Solution: We feed our bias metrics to Gemini 2.5 Flash with a custom prompt:
You are a fairness auditor. Given these hiring bias metrics:
[gender_bias: 20%, income_bias: 15%, proxy_variables: ['Zip_Code']]
Write 3 short paragraphs for an HR team:
1. What the numbers mean
2. Why this bias likely exists
3. Three concrete steps to fix it
Use simple language, no jargon.
Output Example: "Your model hires women at a 20% lower rate than men. This gap is statistically significant and likely violates the EEOC's 80% Rule. To fix it, (1) reweight your training data to balance gender representation, (2) remove Zip Code as a feature (it's a proxy for wealth), (3) test your updated model on a holdout set before deployment."
Why This Wins: Makes fairness actionable. Non-technical stakeholders can understand AND act.
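For reference, a hedged sketch of what the explanation call could look like with the google-genai Python SDK. The prompt mirrors the template above; the function name, dict shape, and fallback wording are assumptions rather than the actual FairLens code.

```python
# Illustrative Gemini call (google-genai SDK); wiring details may differ in main.py.
import os
from google import genai

client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])

def explain_metrics(bias_results: dict) -> str:
    prompt = (
        "You are a fairness auditor. Given these hiring bias metrics:\n"
        f"{bias_results}\n"
        "Write 3 short paragraphs for an HR team:\n"
        "1. What the numbers mean\n"
        "2. Why this bias likely exists\n"
        "3. Three concrete steps to fix it\n"
        "Use simple language, no jargon."
    )
    try:
        response = client.models.generate_content(
            model="gemini-2.5-flash", contents=prompt
        )
        return response.text
    except Exception:
        # Graceful fallback if the API call fails, as described above.
        return "Automated explanation unavailable; review the metrics above manually."

print(explain_metrics({"gender_bias": "20%", "income_bias": "15%",
                       "proxy_variables": ["Zip_Code"]}))
```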
We don't just give one "bias score." We provide three independent metrics:
Metric 1: Demographic Parity Difference (Protected Attribute)
- Formula: |P(hired | Group A) - P(hired | Group B)| × 100
- Threshold: < 10% (EEOC guidance)
- Example: If 60% of men are hired and 35% of women are hired, DPD = 25% (HIGH RISK).
Metric 2: Disparate Impact Ratio (Socioeconomic)
- Formula: P(hired | low-income) / P(hired | high-income)
- Threshold: ≥ 80% (EEOC 80% Rule)
- Example: If 40% of low-income and 60% of high-income candidates are hired, ratio = 67% (FAILS 80% rule, bias = 33%).
Metric 3: Proxy Variable Correlation
- Formula: |Correlation(column, protected_attribute)| > 0.3
- Example: Zip Code correlates r=0.69 with gender → RED FLAG.
Overall Fairness Score: Weighted average (50% Demographic Parity + 30% Socioeconomic + 20% Proxy).
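A minimal sketch of the first two metrics and the weighted score, assuming a DataFrame already normalized to 0/1 outcomes. The "low"/"high" income labels and the way each component maps onto a 0-100 sub-score are assumptions; only the formulas and the 50/30/20 weighting come from the text above.

```python
# Sketch of the core metric computations on a normalized DataFrame.
import pandas as pd

def demographic_parity_diff(df, protected, target):
    """|P(hired | Group A) - P(hired | Group B)| x 100, in percentage points."""
    rates = df.groupby(protected)[target].mean()
    return abs(rates.max() - rates.min()) * 100

def disparate_impact_ratio(df, income_group, target):
    """P(hired | low income) / P(hired | high income), as a percentage (80%+ passes)."""
    rates = df.groupby(income_group)[target].mean()
    # "low"/"high" group labels are assumptions about how income is bucketed.
    return rates.get("low", 0.0) / max(rates.get("high", 1.0), 1e-9) * 100

def overall_fairness_score(dpd, di_ratio, n_proxies):
    """50/30/20 weighting from the text; the sub-score mapping is an assumption."""
    parity_score = max(0.0, 100.0 - dpd)
    socio_score = min(di_ratio, 100.0)
    proxy_score = max(0.0, 100.0 - 50.0 * n_proxies)
    return 0.5 * parity_score + 0.3 * socio_score + 0.2 * proxy_score
```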
User → Frontend (React/Vite) → Backend (FastAPI) → Gemini 2.5 Flash → PDF Report
- App.jsx: React Router entry point. Manages navigation between Landing, Upload, Results pages.
- UploadPage.jsx: CSV file drop zone + Column Mapper. Uses PapaParse to read headers on the fly. Sends column mapping JSON to backend.
- ResultsPage.jsx: Displays bias metrics, charts, Gemini explanation, What-If Simulator. Calls the /export-pdf endpoint.
- Styling: Tailwind CSS + custom Google-style design tokens (blue: #4285F4, red: #EA4335, etc.).
- POST /analyze: Accepts file + column_map (JSON). Calls calculate_all_metrics() from bias_calculator.py. Streams results + Gemini explanation.
- POST /simulate: Accepts file + column_map + list of thresholds. Returns bias metrics at each threshold for the What-If Simulator.
- POST /export-pdf: Accepts results JSON. Uses ReportLab to generate a professional PDF report with bias metrics, charts, and Gemini analysis.
- CORS Enabled: Frontend can call backend from localhost or cloud domain.
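A sketch of the /analyze endpoint shape under these descriptions. The import of calculate_all_metrics from bias_calculator.py follows the text, but its signature, the form-field names, and the permissive CORS settings are assumptions.

```python
# Illustrative FastAPI endpoint; the real main.py may differ in details.
import io
import json

import pandas as pd
from fastapi import FastAPI, File, Form, UploadFile
from fastapi.middleware.cors import CORSMiddleware

from bias_calculator import calculate_all_metrics  # backend module described above

app = FastAPI()
app.add_middleware(
    CORSMiddleware, allow_origins=["*"], allow_methods=["*"], allow_headers=["*"]
)

@app.post("/analyze")
async def analyze(file: UploadFile = File(...), column_map: str = Form(...)):
    # column_map arrives as a JSON string alongside the uploaded CSV.
    mapping = json.loads(column_map)
    df = pd.read_csv(io.BytesIO(await file.read()))
    results = calculate_all_metrics(df, mapping)  # signature assumed for illustration
    # The Gemini explanation step (see the earlier sketch) would be appended here.
    return results
```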
- clean_dataset(): Removes rows with missing protected/target values. Normalizes gender/target values (8 variants).
- calculate_all_metrics(): Computes demographic parity, disparate impact, proxy correlations.
- simulate_threshold(): Recomputes metrics at a specific decision threshold.
- Sends bias_results dict to Gemini 2.5 Flash.
- Receives 3-paragraph fairness explanation.
- Handles API errors gracefully (fallback message if API fails).
- User uploads hiring_2024.csv + maps Gender → Protected, Hired → Target.
- Frontend sends FormData to POST /analyze.
- Backend reads CSV, normalizes values, computes metrics.
- Backend calls Gemini: "Here are the metrics... explain for HR."
- Gemini returns explanation.
- Backend returns { overall_bias_score: 35.5, gender_bias: { percentage: 20, ... }, explanation: "...", ... }.
- Frontend renders Results page with charts, explanation, simulator.
- User drags What-If slider → frontend calls POST /simulate → live chart updates.
- User clicks "Export PDF" → backend generates and downloads PDF.
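The same request can be exercised from a script. This is an illustrative client call only; the exact form-field names ("file", "column_map") and mapping keys are assumptions based on the endpoint description, while the response keys come from the flow above.

```python
# Illustrative client-side call mirroring the flow above (requests library).
import json
import requests

with open("hiring_2024.csv", "rb") as f:
    resp = requests.post(
        "http://localhost:8000/analyze",
        files={"file": ("hiring_2024.csv", f, "text/csv")},
        data={"column_map": json.dumps(
            {"protected_attr": "Gender", "target_col": "Hired"}  # assumed keys
        )},
    )

results = resp.json()
print(results["overall_bias_score"])
print(results["explanation"][:200])
```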
Why This Exists:
- Real-world CSVs have chaos: "Gender," "Sex," "protected_attr," "demographic."
- We auto-detect common names but let users override.
- Supports 8 gender variants + flexible target encoding.
UI Flow:
- User drags CSV → auto-detects columns.
- Dropdowns show: [Gender Type, Hired Decision, Annual Income, ...].
- User selects: Gender Type = Protected Attribute, Hired Decision = Target.
- Mapping summary shows: "Gender Type → Protected | Hired Decision → Outcome."
- Click "Analyze with Gemini" → sends JSON:
{ "protected_attr": "Gender Type", "target_col": "Hired Decision", ... }.
Why This Exists:
- Users need to understand fairness-accuracy tradeoffs.
- A slider that adjusts threshold in real-time and shows bias impact is infinitely more useful than a static report.
UI Flow:
- Slider ranges from 0 (accept everyone) to 1 (accept no one).
- At each slider position, frontend calls computeFrontendSim(threshold), which:
  - Uses existing group rates (rateA, rateB) from analysis.
  - Scales acceptance rate by (1 - threshold).
  - Recomputes bias gap at new rates.
- Live tiles update: Group A Rate, Group B Rate, Bias Gap %, Acceptance Rate %.
- Mini SVG bar chart shows bias across thresholds (pure SVG, no external chart library required).
- Fairness verdict below chart: "At threshold 0.65, bias is 8%—within acceptable range."
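The real computeFrontendSim lives in the React frontend; the Python sketch below only illustrates the arithmetic described above. Treating the overall acceptance rate as the mean of the two group rates (ignoring group sizes) is an assumption.

```python
# Arithmetic of the frontend What-If approximation, expressed in Python for clarity.
def simulate(rate_a: float, rate_b: float, threshold: float) -> dict:
    """Scale each group's acceptance rate by (1 - threshold) and recompute the gap."""
    scaled_a = rate_a * (1 - threshold)
    scaled_b = rate_b * (1 - threshold)
    return {
        "group_a_rate": round(scaled_a * 100, 1),
        "group_b_rate": round(scaled_b * 100, 1),
        "bias_gap_pct": round(abs(scaled_a - scaled_b) * 100, 1),
        # Overall rate approximated as the mean of group rates (assumption).
        "acceptance_rate_pct": round((scaled_a + scaled_b) / 2 * 100, 1),
    }

# e.g. starting from 60% vs 35% group rates, threshold 0.65 shrinks the gap to ~8.8%.
print(simulate(rate_a=0.60, rate_b=0.35, threshold=0.65))
```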
Why This Exists:
- Regulators care about impact, not just intent. Even if your model never sees "Race," using Zip Code can amount to indirect discrimination.
- We detect this automatically by computing Pearson correlation.
Algorithm:
- Convert protected attribute to numeric (0/1 encoding).
- For each numeric column (Age, Zip_Code, Education_Years, etc.):
  - Compute |correlation(column, protected_numeric)|.
  - If > 0.3, flag as proxy variable.
- Return dict: { "Zip_Code": 0.69, "Education_Years": 0.31 }.
- Node.js 16+ (npm)
- Python 3.9+
- Gemini API key (from Google AI Studio)
Step 1: Install dependencies
npm install
Step 2: Create .env file
VITE_API_URL=http://localhost:8000
Step 3: Start dev server
npm run dev
Frontend runs at http://localhost:5173.
Step 1: Install dependencies
cd backend
pip install -r requirements.txt
Step 2: Create .env file
GEMINI_API_KEY=your_api_key_here
Get your key from: https://aistudio.google.com/app/apikey
Step 3: Run FastAPI server
uvicorn main:app --reload
Backend runs at http://localhost:8000.
- 80-100: Mostly Fair. Celebrate! But monitor.
- 60-79: Moderate Concern. Review proxy variables. Adjust threshold.
- 0-59: High Risk. Urgent action required.
- < 10%: Low risk. Acceptable.
- 10-25%: Moderate. Review features & training data.
- > 25%: Critical. Apply reweighting or adversarial debiasing.
- Ratio ≥ 80%: Pass (within EEOC 80% Rule).
- Ratio < 80%: Fail. Violates EEOC guidance.
- |r| > 0.3: Flag. Consider removing or reweighting.
- |r| ≤ 0.3: Safe.
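These bands are straightforward to apply in code. A small helper, with function names and wording assumed for illustration:

```python
# Illustrative helpers applying the interpretation bands listed above.
def interpret_overall_score(score: float) -> str:
    if score >= 80:
        return "Mostly Fair - keep monitoring"
    if score >= 60:
        return "Moderate Concern - review proxy variables and adjust threshold"
    return "High Risk - urgent action required"

def interpret_parity_gap(dpd_pct: float) -> str:
    if dpd_pct < 10:
        return "Low risk - acceptable"
    if dpd_pct <= 25:
        return "Moderate - review features and training data"
    return "Critical - apply reweighting or adversarial debiasing"
```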