This repository outlines a comprehensive program for setting up an in-house Trustworthy AI initiative or capability group. It spans areas beyond technology itself, including ethics, law, social sciences, and philosophy. The goal is to build a Body of Knowledge (BoK) for AI auditors within organizations.
flowchart TD
A("fa:fa-book-open AI Auditor BoK") -- Builds --> B("fa:fa-code Technical Expertise")
A -- Follows --> C("fa:fa-comment-dots Regulatory & Ethics")
A -- Leverages --> D("fa:fa-shapes Auditor Skills")
B -- Feeds into --> E("Organizational AI&ML BoK")
C -- Feeds into --> E
D -- Feeds into --> E
E -- Builds --> F("Trustworthy AI Capability")
style B color:#424242,fill:#BBDEFB,stroke:#FFF9C4
style C color:#FFFFFF,fill:#2962FF,stroke:#FFF9C4
style D color:#FFFFFF,fill:#757575,stroke:#FFF9C4
style F color:#FFFFFF,fill:#FF6D00,stroke:#FFF9C4
Disclaimer
Although I harbour encyclopedic ambitions, this tsundoku-ish repo can only be a work in progress: part learning journey, part intellectual pursuit. It is not intended to be a final one-stop shop.
The ultimate goal is to build a BoK for a team of AI Auditors inside an organization, according to the definition of Body of Knowledge offered by Wikipedia.
No affiliate links whatsoever.
- Transparency & Explainability
- Fairness & Bias
- Privacy & Security
- Robustness & Reliability
- Safety & Alignment
- Systematic Auditing of AI Models
- Audit Process & Methodology
- Tools & Techniques
- Documentation Standards
- Specialized Auditing Skills
- Soft Skills for AI Auditors
- Code Examples
- Tools, Templates & Checklists
- Commercial Auditing Tools
- Training & Certifications
- Books & Papers
- Vendor Resources
Detailed exploration of the concept
The topic of Trustworthy AI has garnered significant attention due to the rapid development and deployment of AI technologies. This interest is driven by the need to ensure that AI systems are safe, fair, explainable, and accountable.
Six crucial dimensions in achieving trustworthy AI:
- Safety & Robustness
- Nondiscrimination & Fairness
- Explainability
- Privacy
- Accountability & Auditability
- Environmental Well-being
Key references:
The five foundational ethical principles for AI:
- Beneficence - AI should be designed to benefit humanity
- Non-maleficence - AI should not cause harm
- Autonomy - AI should respect human agency and decision-making
- Justice - AI should be fair and non-discriminatory
- Explicability - AI decisions should be explainable and transparent
Get more than a passing familiarity with the underlying technology and main paradigms.
- Introduction to AI and ML
- Types of ML and AI systems
  - Supervised, Unsupervised, Semi-supervised, and Reinforcement Learning
  - Self-supervised, Online, Transfer Learning
  - Basic Algorithms
  - How AI, ML and Deep Learning are related
- Neural Networks
- Machine Learning Algorithms
Understanding all stages of the AI & ML Development Lifecycle is critical for auditors. The lifecycle can be thought of in 4 phases:
- Phase 1: Before ML - check if non-ML solutions can solve the problem
- Phase 2: Simple ML models (logistic regression, gradient-boosted trees, k-NN)
- Phase 3: Optimizing simple models (hyperparameter search, feature engineering, ensembles)
- Phase 4: Complex models if simpler approaches don't meet requirements
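An auditor can ask for evidence that each phase was justified before the team moved on to the next one. The sketch below is one possible way to frame that check: it compares a candidate model against a trivial majority-class baseline. The function names and the `min_gain` threshold are illustrative assumptions, not part of any referenced framework.

```python
from collections import Counter


def majority_baseline_accuracy(y_train, y_test):
    """Accuracy of always predicting the most frequent training label."""
    majority_label, _ = Counter(y_train).most_common(1)[0]
    hits = sum(1 for label in y_test if label == majority_label)
    return hits / len(y_test)


def worth_the_complexity(candidate_accuracy, baseline_accuracy, min_gain=0.02):
    """True if the candidate beats the baseline by at least min_gain.

    min_gain is a hypothetical threshold; a real audit would agree on it
    with the business owner and might use a metric other than accuracy.
    """
    return candidate_accuracy - baseline_accuracy >= min_gain
```

A candidate that cannot clear this bar is a signal to stay in the earlier, simpler phase.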
Dedicated page on data quality
Key aspects to evaluate:
- Data quality - accuracy, completeness, consistency
- Relevance - alignment with problem scope
- Contextual appropriateness - time, location, scenario representation
- Bias and variety - representation across groups
- Provenance - sourcing, documentation, trustworthiness
- Evaluating Data Quality
- A 2024 Survey of ETL tools
- Data cleaning, processing, improvement
- Treatment of outliers
- Normalization and scaling
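As a minimal sketch of the last two items, the snippet below flags outliers with Tukey's interquartile-range rule and shows min-max and z-score scaling. It uses only the standard library; the function names and the default `k=1.5` are assumptions chosen for illustration, and a real pipeline would more likely rely on pandas or scikit-learn.

```python
from statistics import mean, pstdev, quantiles


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(values), max(values)
    if high == low:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


def z_score(values):
    """Standardize values to zero mean and unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant feature: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

Whether a flagged value is dropped, capped, or kept is a judgment call that should be documented, since it is exactly the kind of decision an auditor will want to trace.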
Synthetic data is evolving fast with interesting use cases.
Opportunities:
- Addressing data deficits and representation concerns
- Privacy protection and bias reduction
- Economic efficiency vs real-world data collection
- Compliance requirements
Risks:
- Data quality issues leading to unreliable models
- Reverse engineering risks (hence differential privacy)
- Data pollution/contamination
- Model collapse from synthetic data
- Bias propagation
References:
- Opportunities and Risks of Synthetic Data
- How to Validate Synthetic Data Quality
- Ethical Challenges of Using Synthetic Data
- Model Selection - choosing appropriate algorithms based on problem nature
- Hyperparameter Tuning - grid search, random search, Bayesian optimization
- Cross-Validation - evaluating model generalization
- Performance Metrics
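To make cross-validation and grid search concrete, here is a small standard-library sketch. The helper names (`k_fold_indices`, `cross_val_score`, `grid_search`) and the toy threshold classifier are hypothetical and exist only for illustration; in practice a library such as scikit-learn would normally be used instead.

```python
from itertools import product
from statistics import mean


def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous folds (no shuffling)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, test_idx
        start = stop


def cross_val_score(fit, score, X, y, k=5, **params):
    """Average held-out score of fit(...) over k folds."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx], **params)
        scores.append(score(model, [X[i] for i in test_idx], [y[i] for i in test_idx]))
    return mean(scores)


def grid_search(fit, score, X, y, grid, k=5):
    """Try every parameter combination; return (best_params, best_score)."""
    best_params, best_score = None, float("-inf")
    keys = list(grid)
    for combo in product(*(grid[key] for key in keys)):
        params = dict(zip(keys, combo))
        current = cross_val_score(fit, score, X, y, k=k, **params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Toy 1-D model (hypothetical): threshold halfway between class means, plus a margin.
def fit_threshold(X_train, y_train, margin=0.0):
    mean_0 = mean(x for x, label in zip(X_train, y_train) if label == 0)
    mean_1 = mean(x for x, label in zip(X_train, y_train) if label == 1)
    return (mean_0 + mean_1) / 2 + margin


def accuracy(threshold, X_test, y_test):
    return mean(1.0 if (x > threshold) == bool(label) else 0.0
                for x, label in zip(X_test, y_test))
```

Because the folds are contiguous, the data should be shuffled (or stratified) beforehand; for time-ordered data, a forward-chaining split is usually more appropriate than either.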
Reinforcement Learning from Human Feedback (RLHF) incorporates human input to enhance AI model training. Key approaches:
- Binary/Scalar Feedback
- Comparative RLHF
- Proximal Policy Optimization (PPO)
- Direct Preference Optimization (DPO)
- Constitutional AI
- RLAIF (AI Feedback)
- Metrics to evaluate ML algorithms
- Accuracy of training data and model outputs in Generative AI
- Reliability in Machine Learning
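For classification models, most of the common evaluation metrics derive from the four cells of the confusion matrix. The following is a minimal sketch for the binary case; the function names are illustrative, and returning 0.0 when a denominator is zero is a convention chosen here, not a universal rule.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_report(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

On imbalanced data, accuracy alone can look healthy while recall on the minority class is poor, which is why an audit should ask for the full set.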
An undeployed model is worthless, and an unmonitored one is a risk.
Key concerns:
- Concept drift - the data distribution, or the relationship between inputs and outputs, shifts over time
- Quality drift - production data differs from training data
- Infrastructure monitoring - SLAs, failures, latencies, scalability
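One common way to quantify the drift concerns above for a single numeric feature is the Population Stability Index (PSI), which compares the binned distribution of production data against the training data. The sketch below is a simplified version assuming equal-width bins derived from the reference sample; the function name, the bin count and the `eps` smoothing constant are illustrative choices.

```python
import math


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a reference sample (expected) and a new sample (actual).

    Bins are equal-width over the range of the reference sample. Values
    outside that range are clamped into the first or last bin.
    """
    low, high = min(expected), max(expected)
    width = (high - low) / bins or 1.0  # fall back to 1.0 for a constant feature

    def proportions(sample):
        counts = [0] * bins
        for value in sample:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(count / len(sample), eps) for count in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((a_i - e_i) * math.log(a_i / e_i) for e_i, a_i in zip(e, a))
```

A frequently quoted rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but these cut-offs are conventions, not standards, and should be validated per use case.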
References:
- Monitoring Checklist: 7 Things to Track
- Checklist for AI Deployment
- Deployment and Monitoring Overview
Dedicated transparency page | Algorithmic transparency page
Techniques to make AI models interpretable and decisions understandable:
- Explainable AI (XAI)
- Interpretation Methods
- Algorithmic Transparency - EU Framework, UK Standard
- Platform Observability - beyond algorithmic transparency to sociotechnical observability
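To give a feel for how attribution-based interpretation methods work, the sketch below computes exact Shapley values for a single prediction, the game-theoretic quantity that SHAP approximates. It is a simplification under stated assumptions: "missing" features are replaced by a single baseline vector rather than averaged over a background dataset, and the cost grows exponentially with the number of features, so it is only practical for a handful of inputs. The function name is hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley attribution of predict(instance) relative to predict(baseline).

    predict takes a list of feature values and returns a number.
    """
    n = len(instance)

    def value(coalition):
        # Features in the coalition take the instance value, the rest the baseline.
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi
```

The attributions always sum to `predict(instance) - predict(baseline)`, which gives an auditor a simple sanity check on any explanation produced this way.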
Dedicated bias page | Types of AI bias
- Fairness Principles
- Bias Detection and Mitigation
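Bias detection often starts with group fairness metrics computed from model predictions. The sketch below shows three widely used ones for binary predictions: demographic parity difference, disparate impact ratio, and equal opportunity difference. Function names and argument order are illustrative assumptions; libraries such as AIF360 and Fairlearn provide maintained implementations.

```python
def selection_rate(y_pred, groups, group):
    """Share of positive (1) predictions within one group."""
    preds = [p for p, g in zip(y_pred, groups) if g == group]
    return sum(preds) / len(preds)


def demographic_parity_difference(y_pred, groups, group_a, group_b):
    """Difference in selection rates between two groups (0 means parity)."""
    return selection_rate(y_pred, groups, group_a) - selection_rate(y_pred, groups, group_b)


def disparate_impact_ratio(y_pred, groups, unprivileged, privileged):
    """Selection rate of the unprivileged group divided by the privileged one."""
    return (selection_rate(y_pred, groups, unprivileged)
            / selection_rate(y_pred, groups, privileged))


def true_positive_rate(y_true, y_pred, groups, group):
    """Recall within one group: positives correctly predicted as positive."""
    preds = [p for t, p, g in zip(y_true, y_pred, groups) if g == group and t == 1]
    return sum(preds) / len(preds)


def equal_opportunity_difference(y_true, y_pred, groups, group_a, group_b):
    """Difference in true positive rates between two groups."""
    return (true_positive_rate(y_true, y_pred, groups, group_a)
            - true_positive_rate(y_true, y_pred, groups, group_b))
```

A disparate impact ratio below 0.8 is often flagged under the "four-fifths" rule of thumb, though which metric and threshold are appropriate depends on the context and jurisdiction. The functions assume every group has at least one member (and at least one true positive for the TPR-based metric).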
- OpenDP Framework
- differentialprivacy.org
- A friendly, non-technical introduction
- TensorFlow Privacy library
- Opacus library for PyTorch
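The core idea behind the differential privacy resources above can be illustrated with the Laplace mechanism applied to a counting query. This is a teaching sketch only: the function names are hypothetical, Python's `random` module is not a cryptographically secure source of randomness, and naive floating-point noise has known weaknesses, so production systems should use a vetted library such as OpenDP.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Epsilon-differentially-private count of records matching predicate.

    A counting query has sensitivity 1: adding or removing one record
    changes the true answer by at most 1, so the noise scale is 1/epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

Smaller values of `epsilon` add more noise and give stronger privacy; each released answer also consumes privacy budget, which an audit should check is being tracked.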
- Threat Modeling
- Attack Types - Evasion, Poisoning, Model Extraction, Membership Inference, Prompt Injection, Jailbreaking, and more
- Defense Strategies
- Secure AI Development
- Data Pipeline Security
- AI Maintenance: A Robustness Perspective
- Model Performance Evaluation - precision, recall, F1-score, AUC-ROC, cross-validation techniques
- Error Analysis - Confusion Matrix, Feature Importance, Error Patterns
- Debugging ML Models - book
- AI Alignment - making AI systems do what humans want without unintended side effects
- Risk Assessment - identifying and mitigating AI risks
- Fail-safe Mechanisms - graceful degradation strategies
- GenAI Safety - NIST GenAI Risk Management Profile
- Long-term Safety - AGI considerations (Roman V. Yampolskiy's work)
As AI systems become more autonomous, new safety considerations emerge:
- Agent autonomy and oversight
- Tool use and function calling risks
- Multi-agent coordination
- Sandboxing and containment
- Human-in-the-loop requirements
Evaluating capabilities and risks of frontier AI models:
- Capability Evaluations - dangerous capability assessments
- Uplift Studies - measuring capability gains from AI assistance
- Automated Red Teaming - AI testing AI
- Pre-deployment Testing - safety assessments before release
AI/ML Red Teaming identifies vulnerabilities and weaknesses before exploitation:
- Red Teaming Language Models with Language Models
- Learning diverse attacks on LLMs
- Red Teaming LLMs: Methods, Scaling, Lessons
- SocioTechnical Approach to Red Teaming
Vendor Tools:
- EU AI Act - risk-based categorization with corresponding requirements
- NIST AI Risk Management Framework - standard PDF
- GDPR and CCPA for data protection
- OECD AI Principles
Sector-Specific Regulations:
- FINRA AI/ML Guidelines (Financial)
- EEOC AI in Hiring (HR)
- Healthcare AI regulations
- AI Policies - organizational AI usage policies
- Accountability structures - roles and responsibilities
- AIGA Framework
- Defining Organizational AI Governance
- Toward AI Governance Best Practices
- Putting AI Ethics into Practice: The Hourglass Model
- OECD AI Policy Observatory
- EU AI HLEG
- IEEE Ethically Aligned Design
- IEEE 7000-2021 Standard - Addressing Ethical Concerns in System Design
ISO Standards:
- ISO/IEC 42001:2023 - AI Management System
- ISO/IEC 23894:2023 - AI Risk Management
- ISO/IEC TR 24027:2021 - Bias in AI Systems
- ISO/IEC TS 12791 - Treatment of Unwanted Bias
- ISO/IEC DIS 42006 - Requirements for AI Audit Bodies (forthcoming)
Key topics:
- Understanding sustainability concerns around AI/ML models
- Tools and techniques to measure environmental impact
- Carbon footprint of training and inference
- Green AI practices
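A first-order estimate of the carbon footprint of a training run multiplies the energy drawn by the hardware by the carbon intensity of the electricity grid. The sketch below encodes that arithmetic; every parameter name and every input value is an assumption to be replaced with measured figures, and the result ignores embodied emissions, CPU and memory draw, and networking.

```python
def estimate_emissions_kg(device_power_watts, num_devices, hours,
                          grid_intensity_kg_per_kwh, pue=1.0):
    """Rough CO2e estimate (kg) for a training or inference workload.

    energy (kWh) = device power (kW) x devices x hours x PUE
    emissions    = energy x grid carbon intensity (kg CO2e per kWh)

    pue is the data centre's Power Usage Effectiveness (1.0 = no overhead).
    """
    energy_kwh = (device_power_watts / 1000) * num_devices * hours * pue
    return energy_kwh * grid_intensity_kg_per_kwh
```

For example, eight devices drawing 300 W each for 24 hours at a PUE of 1.5 consume 86.4 kWh; at an assumed intensity of 0.4 kg CO2e/kWh that is roughly 34.6 kg CO2e.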
Comprehensive auditing frameworks must:
- Consider multiple dimensions: governance, strategy, performance, monitoring, review
- Cover both technical aspects and ethical considerations
- Adhere to evolving standards (ISO/IEC 42001:2023, EU AI Act, etc.)
- Evaluate monitoring metrics and remediation procedures
- Include the entire AI lifecycle
- Account for stakeholder interests and ethical metrics
Key Challenges:
- Absence of standardized frameworks
- Rapidly evolving field requiring continuous learning
- AI system complexity and black-box nature
- Different regulatory requirements across jurisdictions
- Skills gap in the industry
- Difficulty validating massive training datasets
References:
AI assurance provides confidence that an AI system is designed, developed, and deployed responsibly. Key aspects:
- Independent evaluation
- Criteria-based assessment
- Transparency
- Accountability
- Defining audit objectives and scope
- Developing audit criteria and checklists
- Risk identification and assessment
- Data sampling and analysis - examining training and test data for bias, quality, representativeness
- Data lineage and provenance - integrity verification
- Model evaluation and testing - LIME, SHAP, adversarial testing, stress testing
- Source code and architecture review - security vulnerabilities
- The Right Tool for the Job: Open-Source Auditing Tools in ML
- Hands-on experience with selected tools
- Model Cards - HuggingFace guide, landscape analysis
- Datasheets - Datasheets for Datasets
- Data Statements Guide for NLP
- Data Cards Playbook
- Version Control for AI Models
- Audit of Dataset Licensing
- Metrics for measuring AI progress
- Principles for Evaluation of AI/ML Performance
- Measuring AI Beyond Accuracy
- Loss Functions and Metrics in Deep Learning
- Benchmarking and Comparative Analysis
- Monitoring AI Systems
- Basic Python for data analysis and model inspection
- Libraries: AI Fairness 360, SHAP, LIME
- Understanding different roles: data scientist, AI product owner
AI ethics literature has converged on 5 core principles: transparency, justice and fairness, non-maleficence, responsibility, and privacy.
- Ability to critically evaluate AI-generated outputs
- Healthy skepticism towards AI insights
- Navigating ethical dilemmas in auditing
References:
- The Ethics of AI Business Practices
- An Overview of Artificial Intelligence Ethics
- Ethics-Based Auditing to Develop Trustworthy AI
- The Ethics of AI Ethics: An Evaluation of Guidelines
- Explaining technical concepts to non-technical audiences
- Communicating with stakeholders of varying AI literacy
- Negotiation and conflict resolution in audit scenarios
- Sector-specific knowledge
Practical Python implementations of key Trustworthy AI techniques are available in the code/ folder:
| Topic | File | Libraries |
|---|---|---|
| Bias Testing | bias_testing.py | AIF360, Fairlearn |
| Explainability | explainability.py | SHAP, LIME |
| Adversarial Testing | adversarial_testing.py | ART, PyRIT |
| Evaluation Frameworks | eval_frameworks.py | Inspect AI, Custom |
| Differential Privacy | differential_privacy.py | Opacus, TensorFlow Privacy |
Each file contains verbose explanations of the underlying concepts, practical runnable examples, and best practices for production use.
- Self-Assessment list for Trustworthy AI (ALTAI)
- Microsoft Responsible AI Standard v2
- Data Ethics Canvas
- AI Ethics Policy Template
- AI Ethics Toolkit
- NOREA Guiding Principles
- UK ICO AI Audit Guide
- AI Incident Database
- TrustLLM Toolkit
- AuditNLG (Salesforce)
- HRIA Guidance and Template
- EDPB AI Auditing Checklist
- NIST AI RMF
- Microsoft PyRIT
- Model Cards and Datasheets Collection
- EU Aequitas Project
- SEI MLTE
- LatticeFlow AI Assessments
- AI Auditing Tools: Best 6 Solutions
- Popular Software Tools for AI Auditability
- Top 25 AI Governance Tools
- LAMARR: AI for Auditing
- Fiddler Auditor
- AI Security Tools: Open-Source Toolkit
- ISACA Policy Template Library
Training:
- ISACA AI Resources
- IIA Auditing AI Course
- IIA Essentials for AI Auditing
- Babl AI Courses
- Coursera Responsible GenAI
- MIT AI Strategy and Leadership
Certifications:
- IAPP AIGP
- ISO/IEC 42001 Lead Auditor
- UL Certified AI Professional
- EITCA AI Academy
- ForHumanity Certifications
- Trustworthy AI Papers Collection
- Debugging ML Models with Python
- Towards a Business Case for AI Ethics
- Responsible AI and ESG
- CEPS AI Ethics Task Force Report
- 2024 AI Assurance Technology Market Report
- Code & Conduct: Third Party AI Auditing
- Practicing Trustworthy ML (O'Reilly)
- A Blueprint for Auditing Generative AI