CybercentreCanada/assemblyline-service-document-preview

Document preview service

This repository contains a self-developed Assemblyline service based on a FAME module. It was created by x1mus with support from Sorakurai and reynas at NVISO.

This also contains modified source code from the following repositories:

OCR Configuration

In this service, you can override the default OCR terms from the service base using the ocr key in the config block of the service manifest.

Simple Term Override (Legacy)

Let's say I want to use a custom set of terms for ransomware detection. Then I can set the following:

config:
    ocr:
        ransomware: ['bad1', 'bad2', ...]

This will cause the service to only use the terms I've specified when looking for ransomware terms. This is still subject to the hit threshold defined in the service base.
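To make the role of the hit threshold concrete, here is a minimal Python sketch of threshold-based term matching. It is not the service base's actual implementation: the function names `find_hits` and `is_flagged` are hypothetical, and the case-insensitive substring match is an assumption made for illustration. The default of 2 is the value this README gives for the threshold.

```python
# Hypothetical sketch: names and matching strategy are assumptions,
# not the Assemblyline service base's real code.
DEFAULT_THRESHOLD = 2  # default hit threshold as stated in this README


def find_hits(ocr_text, terms):
    """Return the terms found in the OCR text (assumed: case-insensitive substring match)."""
    lowered = ocr_text.lower()
    return [term for term in terms if term.lower() in lowered]


def is_flagged(ocr_text, terms, threshold=DEFAULT_THRESHOLD):
    """Flag the text when the number of distinct matching terms reaches the threshold."""
    return len(find_hits(ocr_text, terms)) >= threshold


custom_terms = ["bad1", "bad2"]
print(is_flagged("pay bad1 or bad2 now", custom_terms))  # True: two terms hit
print(is_flagged("only bad1 here", custom_terms))        # False: one hit, threshold is 2
```

With a simple list override, only the term list changes; under these assumptions a single matching term would not be enough to trigger detection.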

Advanced Term Override

Let's say I want to use a custom set of terms for ransomware detection, and I also want to set the hit threshold to 1 instead of the default of 2. Then I can set the following:

config:
    ocr:
        ransomware:
            terms: ['bad1', 'bad2', ...]
            threshold: 1

This will cause the service to only use the terms I've specified when looking for ransomware terms, and to apply the hit threshold I've defined instead of the default.
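The legacy list form and the terms/threshold form describe the same kind of override, so one way to picture them is as two spellings that normalize to a single shape. The Python sketch below illustrates that idea only; `normalize_override` is a hypothetical helper, not a function from Assemblyline, and it covers just these two forms.

```python
# Hypothetical helper: illustrates the two override shapes described above,
# not the service's actual parsing logic.
def normalize_override(value, default_threshold=2):
    """Turn a legacy list or a {terms, threshold} mapping into one uniform dict."""
    if isinstance(value, list):
        # Legacy form: only terms are given, so the default threshold applies.
        return {"terms": list(value), "threshold": default_threshold}
    if isinstance(value, dict):
        # Advanced form: threshold is optional and falls back to the default.
        return {
            "terms": list(value.get("terms", [])),
            "threshold": int(value.get("threshold", default_threshold)),
        }
    raise TypeError(f"unsupported override type: {type(value).__name__}")


print(normalize_override(["bad1", "bad2"]))
# {'terms': ['bad1', 'bad2'], 'threshold': 2}
print(normalize_override({"terms": ["bad1", "bad2"], "threshold": 1}))
# {'terms': ['bad1', 'bad2'], 'threshold': 1}
```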

Term Inclusion/Exclusion

Let's say I want to add terms to, or remove terms from, the default set for ransomware detection. Then I can set the following:

config:
    ocr:
        ransomware:
            include: ['bad1', 'bad2', ...]
            exclude: ['bank account']

This will cause the service to start from the default set, add the terms listed in include, and remove the terms listed in exclude when looking for ransomware terms during OCR detection.
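The include/exclude behaviour amounts to a merge over the default term list. The Python sketch below shows one way that merge could work. `apply_include_exclude` is a hypothetical name, the default list is a placeholder (the real defaults live in the service base, and "bitcoin" is made up for the example), and letting exclude win when a term appears in both lists is an assumption.

```python
# Hypothetical sketch of the include/exclude merge; not the service's real code.
def apply_include_exclude(default_terms, include=(), exclude=()):
    """Return default terms plus included terms, minus excluded terms, keeping order."""
    excluded = set(exclude)
    merged = list(default_terms)
    for term in include:
        if term not in merged:  # avoid duplicating a term already in the defaults
            merged.append(term)
    # Assumption: exclusion is applied last, so it overrides inclusion.
    return [term for term in merged if term not in excluded]


# Placeholder defaults for illustration only.
placeholder_defaults = ["bank account", "bitcoin"]
result = apply_include_exclude(
    placeholder_defaults,
    include=["bad1", "bad2"],
    exclude=["bank account"],
)
print(result)  # ['bitcoin', 'bad1', 'bad2']
```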