# Security and Governance

| | |
|-|-|
|Author(s) | [Keeyana Jones](https://github.com/keeyanajones)|

## I. Identity and Access Management (IAM) in MLOps

IAM is the cornerstone of security in any cloud environment, and MLOps is no exception. It defines who can do what, where, and when. In MLOps, access control becomes more complex because of the diverse roles (data scientists, ML engineers, data engineers, ops), the different types of resources (data, models, compute, pipelines), and the multiple stages of the ML lifecycle.

Key Principles:

- Principle of Least Privilege (PoLP): Grant users and service accounts only the permissions absolutely necessary to perform their tasks, and nothing more. This minimizes the blast radius in case of a breach.

- Segregation of Duties (SoD): Separate critical functions so that no single individual has control over an entire process. For example, the person who trains a model shouldn't necessarily be the only one who can deploy it to production.

GCP IAM Best Practices for MLOps:

- Custom Roles over Predefined Roles: While GCP offers many predefined roles (e.g., roles/aiplatform.admin, roles/bigquery.dataViewer), these often grant more permissions than a given task needs. Where they do, create custom roles with granular permissions tailored to specific tasks (e.g., a "Vertex AI Model Deployer" role with the aiplatform.endpoints.deploy permission but no training permissions).

   Example for Vertex AI:
   - Data Scientist: roles/aiplatform.user (or a custom role allowing training, experiment tracking, and notebook access).
   - ML Engineer: A custom role with permissions to create and update Vertex AI Pipelines, manage Model Registry, and deploy to endpoints (e.g., aiplatform.endpoints.deploy, aiplatform.models.upload).
   - Data Engineer: roles/bigquery.dataEditor, roles/storage.objectAdmin on data buckets, and roles/dataflow.admin (if using Dataflow).
   - Read-only Access: A "Vertex AI Model Viewer" custom role for business users to monitor deployed models or view experiment results.
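
As a rough illustration of the custom-role idea above, the following sketch builds a role definition in the field layout used by IAM custom role files (`title`, `description`, `stage`, `includedPermissions`) and checks that no training permissions slipped in. The role title, the permission list, and the helper names are hypothetical; verify every permission name against the IAM permissions reference before using it.

```python
import json

# Hypothetical permission set for a "Vertex AI Model Deployer" role.
# Assumption: these names should be checked against the IAM permissions reference.
DEPLOYER_PERMISSIONS = [
    "aiplatform.endpoints.deploy",
    "aiplatform.endpoints.get",
    "aiplatform.endpoints.list",
    "aiplatform.models.get",
    "aiplatform.models.list",
]


def build_custom_role(title: str, description: str, permissions: list[str]) -> dict:
    """Return a role definition in the shape used by IAM custom role files."""
    if not permissions:
        raise ValueError("A custom role needs at least one permission")
    return {
        "title": title,
        "description": description,
        "stage": "GA",
        # Sorted and de-duplicated so diffs in version control stay readable.
        "includedPermissions": sorted(set(permissions)),
    }


def forbidden_permissions(role: dict, forbidden_prefixes: tuple[str, ...]) -> list[str]:
    """List permissions in the role that start with any forbidden prefix."""
    return [p for p in role["includedPermissions"] if p.startswith(forbidden_prefixes)]


role = build_custom_role(
    title="Vertex AI Model Deployer",
    description="Deploy registered models to endpoints; no training access.",
    permissions=DEPLOYER_PERMISSIONS,
)

# The deployer role should carry no training permissions.
training_prefixes = ("aiplatform.customJobs.", "aiplatform.trainingPipelines.")
leaks = forbidden_permissions(role, training_prefixes)

print(json.dumps(role, indent=2))
print("Training permissions found:", leaks)
```

The printed JSON could be saved to a file and supplied to `gcloud iam roles create` through its `--file` flag; keeping the definition in version control makes later permission reviews easier.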

Service Accounts for Automation:

   - Dedicated Service Accounts: Create separate service accounts for different automated processes (e.g., one for training pipelines, one for deployment pipelines, one for model monitoring jobs).

- Workload Identity (GKE/Cloud Run): On GKE, use Workload Identity Federation for GKE to let Kubernetes service accounts act as GCP service accounts. On Cloud Run, attach a dedicated service account to each service or job as its identity. Both approaches remove the need to create and manage service account keys.
- Vertex AI Service Agent: Vertex AI uses a Google-managed service account (the "Vertex AI Service Agent") to interact with other GCP services on your behalf (e.g., reading data from Cloud Storage for training). Ensure this service agent has the necessary permissions on your resources.

Resource Hierarchy and IAM Policies:

   - Organize your GCP resources using Folders and Projects to align with your organizational structure, and apply IAM policies at the appropriate level. Policies set at the folder level are inherited by the projects and resources within that folder.

   - Project-level vs. Resource-level Access: Granting access at the project level is easier for broad access, but resource-level access (e.g., allowing a service account to only predict on a specific Vertex AI endpoint) is more granular for sensitive production systems.
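
To make the inheritance rule concrete, here is a minimal sketch that models a resource hierarchy as a parent map and computes a member's effective roles as the union of bindings along the path from a resource up to the organization. The hierarchy, group names, and bindings are all hypothetical, and the model is deliberately simplified: it ignores deny policies, IAM conditions, and group membership expansion.

```python
# Hypothetical hierarchy: child resource -> parent resource.
PARENT = {
    "projects/ml-prod": "folders/ml",
    "projects/ml-dev": "folders/ml",
    "folders/ml": "organizations/example",
}

# Hypothetical allow bindings: resource -> role -> members.
BINDINGS = {
    "organizations/example": {
        "roles/logging.viewer": {"group:auditors@example.com"},
    },
    "folders/ml": {
        "roles/aiplatform.viewer": {"group:ml-team@example.com"},
    },
    "projects/ml-dev": {
        "roles/aiplatform.user": {"group:ml-team@example.com"},
    },
}


def ancestors(resource: str) -> list[str]:
    """Return the resource followed by each of its ancestors, nearest first."""
    chain = [resource]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def effective_roles(member: str, resource: str) -> set[str]:
    """Union of roles granted to the member on the resource or any ancestor."""
    roles: set[str] = set()
    for node in ancestors(resource):
        for role, members in BINDINGS.get(node, {}).items():
            if member in members:
                roles.add(role)
    return roles


team = "group:ml-team@example.com"
print(sorted(effective_roles(team, "projects/ml-dev")))
print(sorted(effective_roles(team, "projects/ml-prod")))
```

In this toy setup the ML team can view Vertex AI resources in both projects through the folder-level binding, but holds the broader user role only in the development project, which is the kind of split that project-level grants make possible.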

Regular Audits and Review:

   - Periodically review IAM policies to ensure they still adhere to PoLP and that no unnecessary permissions have accumulated. Use GCP's Policy Analyzer and Recommender services.

   - Leverage Cloud Audit Logs to track who accessed what and when, providing a crucial trail for security investigations.

   - Strong Authentication: Enforce Multi-Factor Authentication (MFA) for all user accounts, especially those with administrative privileges.
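
Part of the periodic review described above can be scripted. The sketch below scans an IAM policy document, in the `bindings` / `role` / `members` JSON shape that IAM policies use, and flags basic roles and public principals. The sample policy, project name, and rule set are hypothetical; a check like this complements Policy Analyzer and Recommender and is not a substitute for them.

```python
import json

# Basic roles that are usually broader than an ML workload needs.
BROAD_ROLES = {"roles/owner", "roles/editor"}
# Principals that make a binding effectively public.
PUBLIC_MEMBERS = {"allUsers", "allAuthenticatedUsers"}

# Hypothetical policy export, for illustration only.
SAMPLE_POLICY_JSON = """
{
  "bindings": [
    {"role": "roles/editor",
     "members": ["serviceAccount:trainer@example-project.iam.gserviceaccount.com"]},
    {"role": "roles/aiplatform.user",
     "members": ["user:alice@example.com", "allUsers"]},
    {"role": "roles/bigquery.dataViewer",
     "members": ["group:analysts@example.com"]}
  ],
  "version": 1
}
"""


def audit_policy(policy: dict) -> list[str]:
    """Return human-readable findings for risky bindings in an IAM policy."""
    findings = []
    for binding in policy.get("bindings", []):
        role = binding["role"]
        for member in binding.get("members", []):
            if role in BROAD_ROLES:
                findings.append(f"{member} holds basic role {role}")
            if member in PUBLIC_MEMBERS:
                findings.append(f"{role} is granted to public principal {member}")
    return findings


findings = audit_policy(json.loads(SAMPLE_POLICY_JSON))
for finding in findings:
    print("FINDING:", finding)
```

Running a script of this kind on a schedule, and failing a CI job when findings appear, is one way to keep accumulated permissions visible between formal reviews.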


## II. Data Residency in MLOps

Data residency refers to the physical location where data is stored and processed. This is a critical concern due to legal and regulatory requirements (e.g., GDPR, HIPAA, country-specific data sovereignty laws).

Challenges in MLOps:

- Global Infrastructure vs. Local Laws: Cloud providers like GCP have global infrastructure, but many regulations require data to stay within specific geographical boundaries (e.g., EU data must remain in the EU).
- Training Data vs. Inference Data: Data used for training models (often large historical datasets) and data used for real-time inference (live user data) both fall under residency rules.
- Model Artifacts: The trained models themselves, and their metadata, can contain sensitive information or be derived from sensitive data, thus potentially falling under data residency requirements.
- Generative AI Nuances: With foundation models like Gemini, understanding where your prompts and outputs are processed and stored is crucial.

GCP Solutions and Best Practices:

   - Regional Selection:
       - Choose the Right Region: When provisioning GCP resources (Vertex AI, Cloud Storage, BigQuery, Compute Engine, GKE), always select the region(s) that meet your data residency requirements.
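       - Multi-Region vs. Single Region: Understand the difference. A single region (e.g., europe-west4) keeps data in one geographic location. A multi-region (e.g., the EU multi-region in Cloud Storage or BigQuery) stores data redundantly across more than one region within a broader geographical boundary, providing higher availability while still staying inside that boundary. Check whether your regulations permit the whole multi-region or require a specific country or region.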
   - Vertex AI and Data Residency:
       - Managed Services: Vertex AI services (Training, Prediction Endpoints, Feature Store, Pipelines) are regional. When you create resources in a specific region, Google commits to processing your data within that region.
       - Generative AI on Vertex AI (Gemini): Google provides commitments for data residency during ML processing for Gemini models in specific regions (e.g., Canada, Europe, US). This means your prompts and the model's processing for inference will stay within your chosen region.

       - Data Caching for Gemini: By default, Gemini models on Vertex AI may cache inputs for up to 24 hours to reduce latency. For strict data residency or retention requirements, you must explicitly disable this caching at the project level. Google provides API calls to manage this setting.
- Customer-Managed Encryption Keys (CMEK): For enhanced control, use CMEK to encrypt your data in Vertex AI, Cloud Storage, and BigQuery. You manage the encryption keys in Cloud Key Management Service (Cloud KMS), providing an extra layer of control over your data's security.

- Data Storage (Cloud Storage, BigQuery): Configure buckets and datasets to be in the required regions. Implement lifecycle policies to manage data retention, ensuring data doesn't linger unnecessarily.
- Hybrid Cloud/Edge Solutions: For extremely strict low-latency or on-premises processing requirements, consider hybrid solutions where some ML inference or data pre-processing occurs on-premises or at the edge, while model training might happen in the cloud.

- Documentation: Clearly document your data flows, storage locations, and processing regions for all ML workloads to demonstrate compliance.
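
The region-selection and documentation practices above lend themselves to an automated check. This is a minimal sketch, assuming you already maintain an inventory of resources with their locations; the inventory entries, resource names, and allow-list are hypothetical, and a real check would pull locations from your asset inventory or infrastructure-as-code definitions.

```python
# Hypothetical allow-list of locations that satisfy the residency policy.
ALLOWED_LOCATIONS = {"eu", "europe-west1", "europe-west4"}

# Hypothetical resource inventory, for illustration only.
inventory = [
    {"name": "gs://training-data", "location": "EU"},
    {"name": "bq:features", "location": "europe-west4"},
    {"name": "vertex-endpoint:churn", "location": "us-central1"},
]


def residency_violations(resources: list[dict], allowed: set[str]) -> list[str]:
    """Return names of resources whose location is outside the allow-list.

    Locations are compared case-insensitively because different services
    report the same location with different capitalization.
    """
    allowed_normalized = {location.lower() for location in allowed}
    return [
        resource["name"]
        for resource in resources
        if resource["location"].lower() not in allowed_normalized
    ]


violations = residency_violations(inventory, ALLOWED_LOCATIONS)
print("Resources outside approved locations:", violations)
```

The same allow-list can double as documentation of approved processing regions, so the written policy and the enforced policy stay in sync.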


## III. Compliance in MLOps

Compliance involves adhering to relevant laws, regulations, industry standards, and internal policies. For MLOps, this extends beyond general data privacy to AI-specific regulations.

Key Compliance Areas:

   - Data Privacy Regulations:
       - GDPR (Europe), CCPA/CPRA (California), HIPAA (Healthcare, US), LGPD (Brazil), PIPL (China): These regulations govern how personal data is collected, stored, processed, and used.
       - Impact on ML: Requires anonymization or pseudonymization of data, consent management, data minimization, and support for the rights to erasure and data portability.
       - GCP Tools:
           - Sensitive Data Protection (which includes the service formerly named Cloud Data Loss Prevention, or Cloud DLP): Detects and redacts or masks PII/PHI before data is used for training or processed by LLMs, and automates the identification and protection of sensitive data across your GCP environment.
           - IAM & Audit Logs: Ensure access control and traceability for sensitive data.
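
To show what pseudonymization and redaction mean in practice, here is a small standard-library sketch. It is a stand-in for illustration only and does not call Sensitive Data Protection: identifiers are replaced with a keyed HMAC-SHA256 token, and email addresses in free text are masked with a simple regular expression. The key, field names, and pattern are hypothetical, and a production system would need managed key storage and far broader detection.

```python
import hashlib
import hmac
import re

# Simplified email pattern; real PII detection needs much broader coverage.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical key; in practice, load it from a secret manager or KMS.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str, key: bytes) -> str:
    """Map an identifier to a stable token that cannot be reversed without the key."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"user_{digest[:16]}"


def redact_emails(text: str) -> str:
    """Replace anything that looks like an email address with a placeholder."""
    return EMAIL_PATTERN.sub("[EMAIL]", text)


record = {"user_id": "alice", "note": "Reach me at alice@example.com after 5pm"}
clean_record = {
    "user_id": pseudonymize(record["user_id"], SECRET_KEY),
    "note": redact_emails(record["note"]),
}
print(clean_record)
```

Because the token is deterministic for a given key, records belonging to the same user can still be joined for training, while rotating or destroying the key breaks the link back to the original identifier.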

AI-Specific Regulations:

- EU AI Act: A landmark regulation that categorizes AI systems by risk level and imposes stringent requirements for high-risk AI (e.g., conformity assessments, risk management systems, human oversight, transparency, robustness).
- NIST AI Risk Management Framework: A voluntary framework providing guidance on managing AI risks, including bias, privacy, and security.
- Ethical AI Principles: Increasingly, organizations are adopting formal ethical AI principles that become internal compliance standards.

Industry-Specific Certifications:

- ISO 27001 (Information Security Management), ISO 27017 (Cloud Security), ISO 27018 (Cloud Privacy), SOC 1/2/3: These certifications and audit reports demonstrate adherence to best practices for security and privacy.
- HIPAA (Health Insurance Portability and Accountability Act): For healthcare data. Google Cloud offers a Business Associate Addendum (BAA) that customers can enter into to support HIPAA compliance.

GCP Support for Compliance:

- Google Cloud Compliance Offerings: Google Cloud regularly undergoes independent verification of its security, privacy, and compliance controls. Check the Google Cloud Compliance page for a comprehensive list of certifications (ISO, SOC, HIPAA, FedRAMP, etc.).

- Assured Workloads: For highly regulated industries, Assured Workloads helps meet compliance requirements by enforcing specific controls on data location, personnel access, and support processes.
- VPC Service Controls: Creates security perimeters around your sensitive data and services, dramatically reducing the risk of data exfiltration. Essential for high-risk ML workloads.
- Access Transparency: Provides visibility into Google administrative access to your data, offering logs of actions taken by Google staff.
- Data Governance & Generative AI on Vertex AI: Google commits to not using your data to train foundation models without explicit permission. As noted in Section II, you can also disable caching for Gemini inputs, which is one step toward zero data retention for prompts. Check the current Vertex AI documentation for any other retention settings that apply to your use case.

Building a Compliance Framework for MLOps:

- Risk Assessment: Identify potential risks (e.g., bias, data leakage, PII exposure, model misuse) at each stage of your ML pipeline.
- Policy Definition: Translate external regulations and internal ethical principles into concrete internal policies for data handling, model development, deployment, and monitoring.
- Tooling & Automation: Implement GCP services and MLOps tools to enforce policies (e.g., IAM, DLP, Vertex AI Model Monitoring).
- Auditing & Reporting: Maintain detailed audit trails (Cloud Audit Logs, Vertex AI Metadata) and generate reports to demonstrate compliance to auditors.
- Training & Awareness: Ensure all personnel involved in MLOps (data scientists, engineers, product managers) are trained on relevant security, privacy, and compliance requirements.
- Continuous Monitoring: Compliance is not a one-time event. Continuously monitor your systems, processes, and model behaviors to ensure ongoing adherence.

By proactively integrating IAM, data residency considerations, and a robust compliance framework into your MLOps strategy, you can build and deploy AI systems responsibly, mitigate risks, and gain the trust of your users and stakeholders.