<a href="https://colab.research.google.com/github/TOM-BOHN/SFDC-User-Permissions-AI/blob/main/Notebooks/SFDC_User_Permission_AI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

Clone the project repository, then install the Python SDK.

In [1]:
!git clone https://github.com/TOM-BOHN/SFDC-User-Permissions-AI.git

Cloning into 'SFDC-User-Permissions-AI'...
remote: Enumerating objects: 177, done.
remote: Counting objects: 100% (177/177), done.
remote: Compressing objects: 100% (130/130), done.
Receiving objects: 100% (177/177), 69.63 KiB | 1.29 MiB/s, done.
remote: Total 177 (delta 81), reused 116 (delta 41), pack-reused 0 (from 0)
Resolving deltas: 100% (81/81), done.


In [2]:
!pip install -Uq "google-genai==1.7.0"


In [3]:
from google import genai
from google.genai import types

from IPython.display import Markdown, display

genai.__version__

###################################

import sys
import os
import time

import enum
import json

import pandas as pd

###################################

os.chdir('/content/SFDC-User-Permissions-AI')

###################################

# Import the processing functions
from src.processing import extract_json_fields
from src.utils.data_utils import save_data
from src.llms import RiskRating, eval_summary, create_chat_session, classify_risk_rating

### Set up your API key

To run the following cell, your API key must be stored in a [Kaggle secret](https://www.kaggle.com/discussions/product-feedback/114053) named `GOOGLE_API_KEY`.

If you don't already have an API key, you can grab one from [AI Studio](https://aistudio.google.com/app/apikey). You can find [detailed instructions in the docs](https://ai.google.dev/gemini-api/docs/api-key).

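To make the key available through Kaggle secrets, choose `Secrets` from the `Add-ons` menu and follow the instructions to add your key or enable it for this notebook. Note that the cell below reads the key with `google.colab.userdata`, so when running in Colab, add the key in Colab's Secrets panel under the same name instead.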

In [4]:
from google.colab import userdata

client = genai.Client(api_key=userdata.get('GOOGLE_API_KEY'))
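
The cell above depends on `google.colab.userdata`, which is only importable inside Colab. As a minimal sketch for running elsewhere, the helper below tries Colab first and then falls back to an environment variable. `load_api_key` is a hypothetical name, not part of this repository, and the environment-variable fallback is an assumption about how you store the key locally.

```python
import os


def load_api_key(name="GOOGLE_API_KEY"):
    """Return the API key from Colab secrets if available, else from the environment.

    Hypothetical helper for illustration; not part of the repository.
    """
    try:
        # Only importable inside Google Colab.
        from google.colab import userdata
    except ImportError:
        userdata = None

    if userdata is not None:
        try:
            key = userdata.get(name)
        except Exception:
            # Secret missing or notebook access not granted; try the environment.
            key = None
        if key:
            return key

    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"No API key found: set the {name} secret or environment variable.")
    return key
```

With this in place, the client could be created as `genai.Client(api_key=load_api_key())`.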

### Automated retry

This codelab sends a lot of requests, so set up an automatic retry
that resends a request whenever the API returns a 429 (per-minute quota reached) or 503 (service unavailable) error.

In [5]:
from google.api_core import retry

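# Retry only on rate-limit (429) and service-unavailable (503) API errors.
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

# Wrap generate_content once; the __wrapped__ check avoids double-wrapping when the cell is re-run.
if not hasattr(genai.models.Models.generate_content, '__wrapped__'):
  genai.models.Models.generate_content = retry.Retry(
      predicate=is_retriable)(genai.models.Models.generate_content)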
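
To make the retry behaviour concrete, here is a minimal standard-library sketch of retry with exponential backoff. It is not how `google.api_core.retry.Retry` is implemented; `QuotaError`, `retry_with_backoff`, and all default values are hypothetical stand-ins chosen for illustration.

```python
import random
import time


class QuotaError(Exception):
    """Hypothetical stand-in for an API error that carries an HTTP status code."""

    def __init__(self, code):
        super().__init__(f"HTTP {code}")
        self.code = code


def retry_with_backoff(func, retriable_codes=(429, 503), max_attempts=5,
                       initial_delay=1.0, multiplier=2.0, sleep=time.sleep):
    """Call func(), retrying on retriable status codes with a growing delay."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except QuotaError as err:
            # Give up on non-retriable errors or once the attempts are used up.
            if err.code not in retriable_codes or attempt == max_attempts:
                raise
            # Small random jitter spreads out retries from concurrent callers.
            sleep(delay + random.uniform(0, 0.1 * delay))
            delay *= multiplier
```

The `sleep` parameter is injectable so the helper can be exercised without actually waiting.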

In [6]:
url = "https://raw.githubusercontent.com/TOM-BOHN/SFDC-User-Permissions-AI/refs/heads/main/data/input/User_Permission_Reference_Data__Sample.csv"
perm_list_df = pd.read_csv(url)
perm_list_df.head()

| | Permission Name | API Name | Description |
|---|---|---|---|
| 0 | Access Data Cloud Data Explorer | AccessCdpDataExplorer | Allows user access Data Cloud Data Explorer. |
| 1 | Administer territory operations | ManageTerritories | Prerequisite user permission for a user to man... |
| 2 | Allow sending of List Emails | ListEmailSend | Allow users to create, edit and send List Emails |
| 3 | Api Only User | ApiUserOnly | Access Salesforce.com only through a Salesforc... |
| 4 | Author Apex | AuthorApex | Create Apex classes and triggers. |


In [7]:
with open('/content/SFDC-User-Permissions-AI/src/prompts/templates/prompt_user_perm_risk_rating.md', 'r') as f:
    PROMPT_USER_PERM_RISK_RATING = f.read()

print(PROMPT_USER_PERM_RISK_RATING)

# Permission Risk Evaluation Prompt Template  
# --------------------------------------------------
# This template can be imported and formatted with the specific
# `permission_name` and `permission_api_name` and `permission_description` variables to create
# a concrete evaluation prompt for any Salesforce permission.
# --------------------------------------------------

# Instruction
You are a **Salesforce security risk assessor**.
Your task is to evaluate the **inherent risk level** of a Salesforce permission (or capability) when granted to a user.
We will provide you with the permission name and a short description of what it allows.
Analyze the permission against the **Evaluation Criteria** below and assign one of the five **Risk Levels** defined in the Rating Rubric.
Give step‑by‑step reasoning for your decision, citing the specific criteria that most influenced your rating.

# Evaluation

## Metric Definition
**Permission Risk** [aka weighted_score] measures the potential negati

In [8]:
chat_session = create_chat_session(client = client, model_name='gemini-2.0-flash')

# Evaluate a single permission
text_eval, struct_eval = eval_summary(
    prompt=PROMPT_USER_PERM_RISK_RATING,
    name=perm_list_df['Permission Name'][0],
    api_name=perm_list_df['API Name'][0],
    description=perm_list_df['Description'][0],
    model_name='gemini-2.0-flash',
    client=client,
    chat_session=chat_session  # Reuse the same session
)

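# Display the result. display() is needed here because a bare expression
# only renders when it is the last line of a cell.
display(Markdown(text_eval))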
print(f"Risk Rating: {struct_eval.name} ({struct_eval.value})")

Risk Rating: CONTROLLED (2)


In [16]:
results_df = classify_risk_rating(
    input_df=perm_list_df,
    prompt=PROMPT_USER_PERM_RISK_RATING,
    chat_session=chat_session,
    total_records=2,
    checkin_interval=60,
    debug=True,
)
results_df

Starting job to process 2 records.
####################

Analyzing Permission 1 of 2...
Name:        Access Data Cloud Data Explorer
API Name:    AccessCdpDataExplorer
Description: Allows user access Data Cloud Data Explorer.
--------------------
Risk Rating: RiskRating.CONTROLLED
####################

Analyzing Permission 2 of 2...
Name:        Administer territory operations
API Name:    ManageTerritories
Description: Prerequisite user permission for a user to manage a territory branch.
--------------------
Risk Rating: RiskRating.SENSITIVE
####################


####################
Total time taken: 4.93 seconds to process 2 records.
Average time per record: 2.46 seconds

Sample Output of Results:
                   Permission Name               API Name  \
0  Access Data Cloud Data Explorer  AccessCdpDataExplorer   
1  Administer territory operations      ManageTerritories   

                                         Description            Risk Rating  \
0       Allows user access

Unnamed: 0,Permission Name,API Name,Description,Risk Rating,Evaluation,Processing Time
0,Access Data Cloud Data Explorer,AccessCdpDataExplorer,Allows user access Data Cloud Data Explorer.,RiskRating.CONTROLLED,"```json\n{\n ""risk_tier"": ""Controlled"",\n ""r...",2.423239
1,Administer territory operations,ManageTerritories,Prerequisite user permission for a user to man...,RiskRating.SENSITIVE,"```json\n{\n ""risk_tier"": ""Sensitive"",\n ""ri...",2.500572
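
The debug log above suggests a loop that classifies one record at a time and reports per-record and total timing. The implementation of `classify_risk_rating` in `src.llms` is not shown in this notebook, so the following is only a rough sketch of that pattern. `classify_batch` and its `classify` callback are hypothetical, and plain dictionaries stand in for DataFrame rows.

```python
import time


def classify_batch(records, classify, total_records=None, clock=time.perf_counter):
    """Classify up to total_records records, timing each call.

    Returns (results, total_seconds, average_seconds_per_record).
    Hypothetical sketch; not the repository's classify_risk_rating.
    """
    selected = records if total_records is None else records[:total_records]
    results = []
    start = clock()
    for record in selected:
        record_start = clock()
        rating = classify(record)  # e.g. one model call per permission
        elapsed = clock() - record_start
        # Keep the input fields and append the rating and timing columns.
        results.append({**record, "Risk Rating": rating, "Processing Time": elapsed})
    total = clock() - start
    average = total / len(results) if results else 0.0
    return results, total, average
```

Passing a stub such as `lambda record: "CONTROLLED"` as `classify` lets you check the bookkeeping without calling the model.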


In [20]:
full_results_df = extract_json_fields(
    results_df,
    json_column='Evaluation',
    debug=True,
)


First 5 rows of processed data:


Unnamed: 0,Permission Name,API Name,Description,Risk Rating,Evaluation,Processing Time,Risk Tier,Weighted Score,Scores,Rationale,Confidence
0,Access Data Cloud Data Explorer,AccessCdpDataExplorer,Allows user access Data Cloud Data Explorer.,2,"{ ""risk_tier"": ""Controlled"", ""risk_rating"": ...",2.423239,Controlled,2.1,"{'Data_Sensitivity': 3, 'Scope_of_Impact': 2, ...",Access to Data Cloud Data Explorer allows view...,High
1,Administer territory operations,ManageTerritories,Prerequisite user permission for a user to man...,3,"{ ""risk_tier"": ""Sensitive"", ""risk_rating"": ""...",2.500572,Sensitive,2.6,"{'Data_Sensitivity': 3, 'Scope_of_Impact': 3, ...",The ability to administer territory operations...,High



Columns added: ['Risk Tier', 'Risk Rating', 'Weighted Score', 'Scores', 'Rationale', 'Confidence']


In [18]:
results_df['Evaluation'][1]

'```json\n{\n  "risk_tier": "Sensitive",\n  "risk_rating": "3",\n  "weighted_score": 2.6,\n  "scores": {\n    "Data_Sensitivity": 3,\n    "Scope_of_Impact": 3,\n    "Configurational_Authority": 3,\n    "External_Data_Exposure": 1,\n    "Regulatory_Obligation": 2,\n    "Segregation_of_Duties": 2,\n    "Auditability": 3,\n    "Reversibility": 2\n  },\n  "rationale": "The ability to administer territory operations involves managing sales territories, potentially impacting sales data and team assignments. Data sensitivity is a concern (score of 3) because territories often contain customer and revenue information. The scope of impact is moderate (score of 3) due to the potential to affect multiple users and opportunities. As such, this permission falls into the \'Sensitive\' category, requiring careful oversight and monitoring.",\n  "confidence": "High"\n}\n```'
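
The raw evaluation above is a JSON object wrapped in a Markdown code fence, so it has to be unwrapped before it can be parsed. The sketch below shows one way to do that and to map the parsed keys to the column names listed earlier. It is not the repository's `extract_json_fields`, whose implementation is not shown here; `parse_fenced_json`, `evaluation_to_columns`, and `COLUMN_NAMES` are hypothetical names.

```python
import json
import re

# Built programmatically so the fence marker never appears literally in this example.
FENCE = "`" * 3
FENCED_JSON = re.compile(rf"^{FENCE}(?:json)?[ \t]*\n(.*?)\n?{FENCE}\s*$", re.DOTALL)

# Mapping from JSON keys in the model response to output column names.
COLUMN_NAMES = {
    "risk_tier": "Risk Tier",
    "risk_rating": "Risk Rating",
    "weighted_score": "Weighted Score",
    "scores": "Scores",
    "rationale": "Rationale",
    "confidence": "Confidence",
}


def parse_fenced_json(text):
    """Strip an optional Markdown code fence and parse the remaining JSON."""
    stripped = text.strip()
    match = FENCED_JSON.match(stripped)
    payload = match.group(1) if match else stripped
    return json.loads(payload)


def evaluation_to_columns(text):
    """Return a dict keyed by column name; missing keys map to None."""
    data = parse_fenced_json(text)
    return {column: data.get(key) for key, column in COLUMN_NAMES.items()}
```

Applied to `results_df['Evaluation'][1]`, `evaluation_to_columns` would return a dictionary whose `'Risk Tier'` entry is `'Sensitive'`. Note that `risk_rating` arrives as a string in the response and is left unconverted here.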

In [19]:
full_results_df.iloc[0].to_dict()

{'Permission Name': 'Access Data Cloud Data Explorer',
 'API Name': 'AccessCdpDataExplorer',
 'Description': 'Allows user access Data Cloud Data Explorer.',
 'Risk Rating': '2',
 'Evaluation': '{  "risk_tier": "Controlled",  "risk_rating": "2",  "weighted_score": 2.1,  "scores": {    "Data_Sensitivity": 3,    "Scope_of_Impact": 2,    "Configurational_Authority": 1,    "External_Data_Exposure": 2,    "Regulatory_Obligation": 2,    "Segregation_of_Duties": 2,    "Auditability": 2,    "Reversibility": 1  },  "rationale": "Access to Data Cloud Data Explorer allows viewing data that might be sensitive, leading to a Data_Sensitivity score of 3. While primarily for exploration, the potential for extracting or misusing the exposed data exists (External_Data_Exposure = 2). Given these factors, a \'Controlled\' risk tier is appropriate, requiring monitoring to prevent unauthorized data handling.",  "confidence": "High"}',
 'Processing Time': 2.42323899269104,
 'Risk Tier': 'Controlled',
 'Weight

In [13]:
# Save the results DataFrame
save_data(
    data=full_results_df,
    filename='results',
    data_type='output',  # This will save to data/output/
    format='csv',
    index=False
)



'data/output/results.csv'