# Loading models from disk

This notebook loads the LLM Guard models from local disk instead of downloading them from HuggingFace at runtime. This is useful when you deploy LLM Guard on a server and want several instances to share a single copy of the models.

## Pull models from HuggingFace

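First, we pull the models from [HuggingFace and save them to disk](https://huggingface.co/docs/hub/en/models-downloading). The commands below clone over SSH, which requires an SSH key registered with your HuggingFace account. You can also pull the models from other sources, as long as they end up on disk.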

In [None]:
!git lfs install
!git clone git@hf.co:protectai/deberta-v3-base-prompt-injection-v2
!git clone git@hf.co:MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33
!git clone git@hf.co:tomaarsen/span-marker-bert-base-orgs
!git clone git@hf.co:unitary/unbiased-toxic-roberta
!git clone git@hf.co:philomath-1209/programming-language-identification
!git clone git@hf.co:madhurjindal/autonlp-Gibberish-Detector-492513457
!git clone git@hf.co:papluca/xlm-roberta-base-language-detection
!git clone git@hf.co:Isotonic/deberta-v3-base_finetuned_ai4privacy_v2
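
If SSH access or `git lfs` is not available, the same repositories can probably be fetched with the `huggingface_hub` Python package instead. The following is a minimal sketch, not part of the original notebook: `local_dir_for` and `download_all` are hypothetical helper names, and it assumes `huggingface_hub` is installed and that the target directories should match the names `git clone` would create.

```python
from pathlib import Path

# Repository IDs taken from the clone commands above.
REPO_IDS = [
    "protectai/deberta-v3-base-prompt-injection-v2",
    "MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33",
    "tomaarsen/span-marker-bert-base-orgs",
    "unitary/unbiased-toxic-roberta",
    "philomath-1209/programming-language-identification",
    "madhurjindal/autonlp-Gibberish-Detector-492513457",
    "papluca/xlm-roberta-base-language-detection",
    "Isotonic/deberta-v3-base_finetuned_ai4privacy_v2",
]


def local_dir_for(repo_id: str, root: str = ".") -> Path:
    """Return the directory `git clone` would create: the repo name without the owner."""
    return Path(root) / repo_id.split("/")[-1]


def download_all(repo_ids=REPO_IDS, root: str = ".") -> list[Path]:
    """Download every repository into root/<repo name> (hypothetical helper)."""
    # Imported lazily so this module loads even if huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    targets = []
    for repo_id in repo_ids:
        target = local_dir_for(repo_id, root)
        # snapshot_download fetches all files of the repository into local_dir.
        snapshot_download(repo_id=repo_id, local_dir=str(target))
        targets.append(target)
    return targets
```

Calling `download_all()` from the notebook's working directory should leave the same `./<repo name>` folders that the rest of the notebook expects.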

**Note**: If you use only `ONNX` models, you can remove the other versions of the models to save disk space.
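
Before deleting anything, it can help to see how much space each file format takes in a cloned model directory. The sketch below is an illustration only: `size_by_suffix` is a hypothetical helper, it uses just the standard library, and it skips the `.git` folder so that only the working-tree files are counted.

```python
from collections import defaultdict
from pathlib import Path


def size_by_suffix(model_dir: str) -> dict[str, int]:
    """Sum file sizes in bytes under model_dir, grouped by file extension."""
    totals: dict[str, int] = defaultdict(int)
    for path in Path(model_dir).rglob("*"):
        # Ignore git metadata; only the checked-out model files are counted.
        if path.is_file() and ".git" not in path.relative_to(model_dir).parts:
            totals[path.suffix or "(none)"] += path.stat().st_size
    return dict(totals)
```

For example, `size_by_suffix("./unbiased-toxic-roberta")` returns a dictionary keyed by extension, which shows at a glance how much space the non-ONNX weight files occupy.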

## Use local models in LLM Guard

Next, we point each scanner's model configuration at its local directory and set `local_files_only`, so that LLM Guard loads the models from disk.
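
Because nothing is downloaded once `local_files_only` is set, a missing or incomplete clone only shows up as a load error later. A quick pre-flight check can catch that earlier. This is a minimal sketch under stated assumptions: `missing_model_dirs` is a hypothetical helper, and it assumes each cloned repository has a `config.json` at its top level.

```python
from pathlib import Path


def missing_model_dirs(model_dirs: list[str]) -> list[str]:
    """Return the directories that are absent or have no top-level config.json."""
    missing = []
    for model_dir in model_dirs:
        # Assumption: a usable local clone contains config.json at its root.
        if not (Path(model_dir) / "config.json").is_file():
            missing.append(model_dir)
    return missing
```

Passing the list of local paths used in this notebook (for example `"./deberta-v3-base-prompt-injection-v2"`) should return an empty list when every clone is in place.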

In [None]:
!pip install llm_guard@git+https://github.com/protectai/llm-guard.git

In [11]:
from llm_guard import scan_prompt
from llm_guard.input_scanners import (
    Anonymize,
    BanCompetitors,
    BanTopics,
    Code,
    Gibberish,
    Language,
    PromptInjection,
    Toxicity,
)
from llm_guard.input_scanners.anonymize_helpers import DEBERTA_AI4PRIVACY_v2_CONF
from llm_guard.input_scanners.ban_competitors import MODEL_BASE as BAN_COMPETITORS_MODEL
from llm_guard.input_scanners.ban_topics import MODEL_DEBERTA_BASE_V2 as BAN_TOPICS_MODEL
from llm_guard.input_scanners.code import DEFAULT_MODEL as CODE_MODEL
from llm_guard.input_scanners.gibberish import DEFAULT_MODEL as GIBBERISH_MODEL
from llm_guard.input_scanners.language import DEFAULT_MODEL as LANGUAGE_MODEL
from llm_guard.input_scanners.prompt_injection import V2_MODEL as PROMPT_INJECTION_MODEL
from llm_guard.input_scanners.toxicity import DEFAULT_MODEL as TOXICITY_MODEL
from llm_guard.vault import Vault

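# Point each model config at its local clone and set `local_files_only`
# so the loader does not try to reach the HuggingFace Hub.
PROMPT_INJECTION_MODEL.kwargs["local_files_only"] = True
PROMPT_INJECTION_MODEL.path = "./deberta-v3-base-prompt-injection-v2"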

DEBERTA_AI4PRIVACY_v2_CONF["DEFAULT_MODEL"].path = "./deberta-v3-base_finetuned_ai4privacy_v2"
DEBERTA_AI4PRIVACY_v2_CONF["DEFAULT_MODEL"].kwargs["local_files_only"] = True

BAN_TOPICS_MODEL.path = "./deberta-v3-base-zeroshot-v1.1-all-33"
BAN_TOPICS_MODEL.kwargs["local_files_only"] = True

TOXICITY_MODEL.path = "./unbiased-toxic-roberta"
TOXICITY_MODEL.kwargs["local_files_only"] = True

BAN_COMPETITORS_MODEL.path = "./span-marker-bert-base-orgs"
BAN_COMPETITORS_MODEL.kwargs["local_files_only"] = True

CODE_MODEL.path = "./programming-language-identification"
CODE_MODEL.kwargs["local_files_only"] = True

GIBBERISH_MODEL.path = "./autonlp-Gibberish-Detector-492513457"
GIBBERISH_MODEL.kwargs["local_files_only"] = True

LANGUAGE_MODEL.path = "./xlm-roberta-base-language-detection"
LANGUAGE_MODEL.kwargs["local_files_only"] = True

vault = Vault()
input_scanners = [
    Anonymize(vault, recognizer_conf=DEBERTA_AI4PRIVACY_v2_CONF),
    BanTopics(["politics", "religion"], model=BAN_TOPICS_MODEL),
    BanCompetitors(["google", "facebook"], model=BAN_COMPETITORS_MODEL),
    Toxicity(model=TOXICITY_MODEL),
    Code(["Python", "PHP"], model=CODE_MODEL),
    Gibberish(model=GIBBERISH_MODEL),
    Language(["en"], model=LANGUAGE_MODEL),
    PromptInjection(model=PROMPT_INJECTION_MODEL),
]

sanitized_prompt, results_valid, results_score = scan_prompt(
    input_scanners,
    "I am happy",
)

print(sanitized_prompt)
print(results_valid)
print(results_score)

2024-03-21 12:39:44 [debug    ] No entity types provided, using default default_entities=['CREDIT_CARD', 'CRYPTO', 'EMAIL_ADDRESS', 'IBAN_CODE', 'IP_ADDRESS', 'PERSON', 'PHONE_NUMBER', 'US_SSN', 'US_BANK_NUMBER', 'CREDIT_CARD_RE', 'UUID', 'EMAIL_ADDRESS_RE', 'US_SSN_RE']
2024-03-21 12:39:46 [debug    ] Initialized NER model          device=device(type='mps') model=Model(path='./deberta-v3-base_finetuned_ai4privacy_v2', subfolder='', onnx_path='Isotonic/deberta-v3-base_finetuned_ai4privacy_v2', onnx_subfolder='onnx', onnx_filename='model.onnx', kwargs={'local_files_only': True}, pipeline_kwargs={'aggregation_strategy': 'simple', 'ignore_labels': ['O', 'CARDINAL']})
2024-03-21 12:39:47 [debug    ] Loaded regex pattern           group_name=CREDIT_CARD_RE
2024-03-21 12:39:47 [debug    ] Loaded regex pattern  

Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


2024-03-21 12:40:04 [debug    ] Prompt does not have sensitive data to replace risk_score=0.0
2024-03-21 12:40:04 [debug    ] Scanner completed              elapsed_time_seconds=1.366613 is_valid=True scanner=Anonymize


Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


2024-03-21 12:40:05 [debug    ] No banned topics detected      scores={'religion': 0.5899404287338257, 'politics': 0.4100596308708191}
2024-03-21 12:40:05 [debug    ] Scanner completed              elapsed_time_seconds=0.911 is_valid=True scanner=BanTopics
2024-03-21 12:40:05 [debug    ] None of the competitors were detected
2024-03-21 12:40:05 [debug    ] Scanner completed              elapsed_time_seconds=0.569812 is_valid=True scanner=BanCompetitors
2024-03-21 12:40:06 [debug    ] Not toxicity found in the text results=[[{'label': 'toxicity', 'score': 0.0003712967736646533}, {'label': 'male', 'score': 0.00016587311984039843}, {'label': 'female', 'score': 0.00012892877566628158}, {'label': 'insult', 'sco