BlueFalconHD/apple_generative_model_safety_decrypted

apple_generative_model_safety_decrypted

Decrypted Generative Model safety files for Apple Intelligence containing filters

If you are curious about my process of figuring this stuff out, take a peek inside HOW.md

Structure

  • decrypted_overrides/: Contains decrypted overrides for various models.
    • com.apple.*/: Directory named using the Asset Specifier associated with the safety info
      • Info.plist: Contains metadata for the override
      • AssetData/: Contains the decrypted JSON files
  • combined_metadata/: Contains combined and deduplicated metadata files for convenient review
    • global_metadata.json: Combined global safety filters from all models
    • region_*.json: Combined region-specific safety filters (e.g., region_CN_metadata.json)
    • locale_*.json: Combined locale-specific safety filters (e.g., locale_en_US_metadata.json)
  • get_key_lldb.py: Script to get the encryption key (see usage info below)
  • decrypt_overrides.py: Script to decrypt the overrides (see usage info below)
  • combine_metadata.py: Script to combine and deduplicate metadata files by region/locale
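
As a rough illustration of the layout above, the following sketch walks a decrypted_overrides directory and yields each JSON file together with its Asset Specifier. The function name iter_override_files is hypothetical and not part of this repository, and the sketch assumes the JSON files sit somewhere under each AssetData/ directory.

```python
from pathlib import Path


def iter_override_files(root):
    """Yield (asset_specifier, json_path) pairs found under `root`.

    Hypothetical helper: assumes the layout
    <root>/com.apple.*/AssetData/**/*.json described in this README.
    """
    root = Path(root)
    for asset_dir in sorted(root.glob("com.apple.*")):
        data_dir = asset_dir / "AssetData"
        if not data_dir.is_dir():
            # Skip specifier directories that have no decrypted data.
            continue
        for json_path in sorted(data_dir.rglob("*.json")):
            yield asset_dir.name, json_path


if __name__ == "__main__":
    for specifier, path in iter_override_files("decrypted_overrides"):
        print(specifier, path.name)
```

Run from the repository root, this prints one line per decrypted JSON file, which can be a quick way to check what a fresh decryption produced.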

Usage

Python dependencies

cryptography is the only dependency required to run the decryption script. You can install it using pip:

pip install cryptography

Getting the encryption key

To retrieve the encryption key (generated by ModelCatalog.Obfuscation.readObfuscatedContents) for the overrides, you must attach LLDB to GenerativeExperiencesSafetyInferenceProvider (/System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider). It is important that this is Xcode's LLDB, not the default macOS one or LLVM's lldb. The method I recommend to get LLDB to attach:

  • Run sudo killall GenerativeExperiencesSafetyInferenceProvider; sudo xcrun lldb -w -n GenerativeExperiencesSafetyInferenceProvider /System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider
  • In the Shortcuts app, create a dummy shortcut that uses the Generative Model action ("Use Model") and select the On-Device option. Type anything into the text field; the content doesn't matter. Then run the shortcut.
  • You should see LLDB attach to (the newly started instance of) GenerativeExperiencesSafetyInferenceProvider with a message like this:
(lldb) process attach --name "GenerativeExperiencesSafetyInferenceProvider" --waitfor
Process 53629 stopped
* thread #1, stop reason = signal SIGSTOP
    frame #0: 0x00000001839f41f8 dyld`dyld4::PrebuiltLoader::dependent(dyld4::RuntimeState const&, unsigned int, mach_o::LinkedDylibAttributes*) const + 116
dyld`dyld4::PrebuiltLoader::dependent:
->  0x1839f41f8 <+116>: add    x0, sp, #0xe
    0x1839f41fc <+120>: mov    x1, x19
    0x1839f4200 <+124>: bl     0x1839e50dc    ; dyld4::Loader::LoaderRef::loader(dyld4::RuntimeState const&) const
    0x1839f4204 <+128>: ldrh   w8, [x20, #0x4]
Target 0: (GenerativeExperiencesSafetyInferenceProvider) stopped.
Executable binary set to "/System/Library/ExtensionKit/Extensions/GenerativeExperiencesSafetyInferenceProvider.appex/Contents/MacOS/GenerativeExperiencesSafetyInferenceProvider".
Architecture set to: arm64e-apple-macosx-.
  • In LLDB, with this repository's root as the working directory, run: command script import get_key_lldb.py
  • Then run c to continue the process. LLDB will print the encryption key to the console and save it to ./key.bin.

Decrypting the overrides

To decrypt the overrides, run the following command in the root of this repository:

python decrypt_overrides.py /System/Library/AssetsV2/com_apple_MobileAsset_UAF_FM_Overrides/purpose_auto \
  -k key.bin \
  -o decrypted_overrides

The decrypted_overrides directory will be created if it does not exist, and the decrypted overrides will be placed in it. This is only necessary if the overrides have been updated; this repository already contains a decrypted version of the overrides that is up to date as of June 28, 2025.

Combining metadata files

After decrypting the overrides, you can run the combine_metadata.py script to generate combined and deduplicated metadata files:

python3 combine_metadata.py

This script will:

  • Process all metadata.json files in the decrypted_overrides directory
  • Combine them by region/locale and create a global combined file
  • Deduplicate all entries to provide clean, consolidated lists
  • Save the results to the combined_metadata/ directory
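
The combine-and-deduplicate step can be sketched roughly as follows. This is not the code in combine_metadata.py; merge_overrides is a hypothetical name, and the sketch assumes every metadata.json uses the six fields shown later in this README. How files are grouped by region or locale is left out, since that depends on details not covered here.

```python
LIST_KEYS = ("reject", "remove", "regexReject", "regexRemove")
DICT_KEYS = ("replace", "regexReplace")


def merge_overrides(overrides):
    """Merge several override dicts into one, dropping duplicate entries.

    Hypothetical sketch: list fields keep first-seen order, and for
    dict fields the first mapping seen for a key wins.
    """
    combined = {key: [] for key in LIST_KEYS}
    combined.update({key: {} for key in DICT_KEYS})
    seen = {key: set() for key in LIST_KEYS}

    for override in overrides:
        for key in LIST_KEYS:
            for entry in override.get(key, []):
                if entry not in seen[key]:
                    seen[key].add(entry)
                    combined[key].append(entry)
        for key in DICT_KEYS:
            for source, target in override.get(key, {}).items():
                # Keep the first replacement seen for each source phrase.
                combined[key].setdefault(source, target)
    return combined
```

Preserving first-seen order keeps the combined lists stable between runs, which makes diffs of the output easier to read.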

The combined metadata files provide the most convenient way to review all safety filters, as they eliminate duplicate entries and group related filters together. These files are particularly useful for:

  • Understanding what content is filtered globally vs. by region/locale
  • Analyzing the scope and nature of safety filters across different contexts
  • Reviewing the complete set of safety rules without having to examine hundreds of individual files

Understanding the overrides

The overrides are JSON files that contain safety filters for various generative models. Each override is associated with a specific model context (from what I can tell) and contains rules that determine how the model should behave in certain situations, such as filtering out harmful content or ensuring compliance with safety standards.

Here is an example of one override's metadata.json file, sourced from dec_out_repo/decrypted_overrides/com.apple.gm.safety_deny.output.code_intelligence.base. Note the output part of the specifier, which indicates that this is a safety override for model output rather than user input:

{
  "reject": [
    "xylophone copious opportunity defined elephant 10out",
    "xylophone copious opportunity defined elephant out"
  ],
  "remove": [],
  "replace": {},
  "regexReject": [
    "(?i)\\bbitch\\b",
    "(?i)\\bdago\\b",
    "(?i)\\bdyke\\b",
    "(?i)\\bhebe\\b",
    ...
  ],
  "regexRemove": [],
  "regexReplace": {}
}

Here, the reject field contains exact phrases which will result in a guardrail violation. The remove field contains phrases that will be removed from the output, while the replace field contains phrases that will be replaced with other phrases. The regexReject, regexRemove, and regexReplace fields contain regular expressions that will be used to match and filter content in a similar manner.
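
To make the field semantics above concrete, here is a minimal sketch of how such an override might be applied to a piece of text. It is an illustration only, not Apple's implementation: apply_override is a hypothetical function, phrase matching is assumed to be a case-sensitive substring test, the order of the steps is a guess, and Python's re module stands in for whatever regex engine the real system uses.

```python
import re


def apply_override(text, override):
    """Return (rejected, filtered_text) for one override dict.

    Hypothetical sketch; matching rules and step order are assumptions.
    """
    # Reject: any exact phrase or regex match counts as a violation.
    if any(phrase in text for phrase in override.get("reject", [])):
        return True, text
    if any(re.search(p, text) for p in override.get("regexReject", [])):
        return True, text

    # Remove: strip matching phrases and patterns from the text.
    for phrase in override.get("remove", []):
        text = text.replace(phrase, "")
    for pattern in override.get("regexRemove", []):
        text = re.sub(pattern, "", text)

    # Replace: swap matching phrases and patterns for their substitutes.
    for old, new in override.get("replace", {}).items():
        text = text.replace(old, new)
    for pattern, new in override.get("regexReplace", {}).items():
        # A function replacement inserts `new` literally (no escapes).
        text = re.sub(pattern, lambda match: new, text)

    return False, text


if __name__ == "__main__":
    # Made-up sample data, not taken from the decrypted files.
    sample = {
        "reject": ["forbidden phrase"],
        "regexReject": ["(?i)\\bblocked\\b"],
        "remove": ["[tag]"],
        "replace": {"colour": "color"},
    }
    print(apply_override("this is BLOCKED", sample))
    print(apply_override("a [tag]colour", sample))
```

The \b word boundaries in the regexReject patterns mean that a listed word only matches when it stands alone, so a longer word that merely contains it is not rejected.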
