Configuration Reference

Flip Forensics edited this page Mar 31, 2026 · 3 revisions

AIFT uses a layered configuration system. Settings are resolved in this order (highest priority first):

  1. Environment variables — API keys from shell environment
  2. config.yaml — User overrides in the project root
  3. Hardcoded defaults — Built-in values in app/config.py

If config.yaml does not exist on startup, AIFT creates one from the defaults automatically.
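The three-layer lookup can be sketched as follows. This is an illustrative sketch, not AIFT's actual internals — the `resolve` function, `DEFAULTS`, and `ENV_MAP` names are assumptions for the example:

```python
import os

# Layered resolution sketch: a non-empty environment variable wins over
# config.yaml, which wins over the hardcoded default.
DEFAULTS = {"ai.claude.api_key": "", "server.port": 5000}
ENV_MAP = {"ai.claude.api_key": "ANTHROPIC_API_KEY"}

def resolve(key, user_config):
    env_name = ENV_MAP.get(key)
    env_value = os.environ.get(env_name, "") if env_name else ""
    if env_value:                   # layer 1: non-empty env var wins
        return env_value
    if key in user_config:          # layer 2: config.yaml override
        return user_config[key]
    return DEFAULTS[key]            # layer 3: hardcoded default
```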


ai Section

Controls which AI provider is active and per-provider connection settings.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `ai.provider` | string | `"claude"` | Active AI provider. One of: `claude`, `openai`, `kimi`, `local` |

ai.claude

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `ai.claude.api_key` | string | `""` | Anthropic API key. Can also be set via the `ANTHROPIC_API_KEY` env var |
| `ai.claude.model` | string | `"claude-opus-4-6"` | Claude model identifier |
| `ai.claude.attach_csv_as_file` | bool | `true` | Send CSV data as a file attachment rather than inline text |
| `ai.claude.request_timeout_seconds` | int | `600` | HTTP request timeout in seconds |

ai.openai

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `ai.openai.api_key` | string | `""` | OpenAI API key. Can also be set via the `OPENAI_API_KEY` env var |
| `ai.openai.model` | string | `"gpt-5.2"` | OpenAI model identifier |
| `ai.openai.attach_csv_as_file` | bool | `true` | Send CSV data as a file attachment rather than inline text |
| `ai.openai.request_timeout_seconds` | int | `600` | HTTP request timeout in seconds |

ai.kimi

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `ai.kimi.api_key` | string | `""` | Moonshot/Kimi API key. Can also be set via the `MOONSHOT_API_KEY` or `KIMI_API_KEY` env var |
| `ai.kimi.model` | string | `"kimi-k2-turbo-preview"` | Kimi model identifier |
| `ai.kimi.base_url` | string | `"https://api.moonshot.ai/v1"` | Kimi API base URL |
| `ai.kimi.attach_csv_as_file` | bool | `true` | Send CSV data as a file attachment rather than inline text |
| `ai.kimi.request_timeout_seconds` | int | `600` | HTTP request timeout in seconds |

ai.local

For OpenAI-compatible local endpoints (Ollama, LM Studio, vLLM, etc.).

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `ai.local.base_url` | string | `"http://localhost:11434/v1"` | Local model API endpoint |
| `ai.local.model` | string | `"llama3.1:70b"` | Model name as recognized by the local server |
| `ai.local.api_key` | string | `"not-needed"` | API key (most local servers ignore this) |
| `ai.local.attach_csv_as_file` | bool | `true` | Send CSV data as a file attachment rather than inline text |
| `ai.local.request_timeout_seconds` | int | `3600` | HTTP request timeout in seconds (higher default for local models) |
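Putting the above together, a config.yaml `ai` section that switches to a local Ollama endpoint might look like this (the values shown are simply the documented defaults for `ai.local`):

```yaml
ai:
  provider: local
  local:
    base_url: "http://localhost:11434/v1"
    model: "llama3.1:70b"
    api_key: "not-needed"          # most local servers ignore this
    attach_csv_as_file: true
    request_timeout_seconds: 3600  # generous timeout for local inference
```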

server Section

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `server.port` | int | `5000` | TCP port the Flask server listens on (1–65535) |
| `server.host` | string | `"127.0.0.1"` | Bind address. Use `127.0.0.1` for localhost only |
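For example, to listen on all interfaces on a non-default port (a plausible override, consistent with the table above — note that `server.host` is config-only and cannot be changed from the UI):

```yaml
server:
  host: "0.0.0.0"   # bind to all interfaces (exposes the UI to the network)
  port: 8080
```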

evidence Section

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `evidence.large_file_threshold_mb` | int | `0` | Maximum evidence upload size in MB. Set to `0` for unlimited (recommended for large forensic images). Files exceeding this limit are rejected with a suggestion to use path mode |
| `evidence.csv_output_dir` | string | `""` | Optional directory for parsed CSV output. Empty = store inside the case directory |
| `evidence.intake_timeout_seconds` | int | `7200` | Frontend timeout for evidence intake requests (hashing + validation). Large images (100 GB+) may need 30+ minutes. Default is 2 hours |
| `evidence.compute_hashes` | bool | `true` | Compute SHA-256 and MD5 hashes during evidence intake. Disable to speed up intake for large images; the report will show SKIPPED instead of PASS/FAIL for hash verification |
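As an example, a config.yaml fragment that caps uploads at 2 GB and trades hash verification for faster intake (values are illustrative, not defaults):

```yaml
evidence:
  large_file_threshold_mb: 2048   # reject uploads above 2 GB; 0 = unlimited
  compute_hashes: false           # report shows SKIPPED for hash verification
```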

analysis Section

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| `analysis.ai_max_tokens` | int | `128000` | Maximum token budget for the AI prompt (system + user + data) |
| `analysis.shortened_prompt_cutoff_tokens` | int | `64000` | When `ai_max_tokens` is below this value, a compact prompt template (without statistics) is used to save tokens |
| `analysis.connection_test_max_tokens` | int | `256` | Max tokens for the "Test Connection" health check call |
| `analysis.date_buffer_days` | int | `7` | Days to add before/after investigation dates when filtering artifact records |
| `analysis.citation_spot_check_limit` | int | `20` | Maximum number of citations (timestamps, row refs, column names) to validate per artifact |
| `analysis.artifact_deduplication_enabled` | bool | `true` | Remove duplicate rows from artifact CSVs before sending to the AI |
| `analysis.artifact_ai_columns_config_path` | string | `"config/artifact_ai_columns.yaml"` | Path to the artifact column filtering config file |

Note: max_merge_rounds is not in the hardcoded defaults but can be set in config.yaml or via the UI advanced settings. It controls the maximum number of hierarchical merge iterations when chunked analysis produces multiple findings. The UI defaults to 5.
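A config.yaml fragment exercising these keys might look like this (the values are illustrative overrides, not defaults):

```yaml
analysis:
  ai_max_tokens: 48000     # below the 64000 cutoff, so the compact prompt is used
  date_buffer_days: 14     # widen the filtering window around investigation dates
  max_merge_rounds: 5      # not in the hardcoded defaults; see the note above
```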


Environment Variable Mapping

Environment variables override config.yaml values. Only API keys are mapped:

| Environment Variable | Config Key | Notes |
| --- | --- | --- |
| `ANTHROPIC_API_KEY` | `ai.claude.api_key` | Anthropic/Claude API key |
| `OPENAI_API_KEY` | `ai.openai.api_key` | OpenAI API key |
| `MOONSHOT_API_KEY` | `ai.kimi.api_key` | Moonshot/Kimi API key (checked first) |
| `KIMI_API_KEY` | `ai.kimi.api_key` | Alias for `MOONSHOT_API_KEY` (checked second) |

If an environment variable is set and non-empty, it always wins over the config.yaml value.
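For instance, to override the Claude key for the current shell session only (the value below is a placeholder, not a real key):

```shell
# Takes precedence over ai.claude.api_key in config.yaml for this session.
# Run `unset ANTHROPIC_API_KEY` to fall back to the config.yaml value.
export ANTHROPIC_API_KEY="sk-ant-placeholder"
```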


UI Settings Modal vs Config-Only

The settings gear icon in the web UI exposes a Basic tab and an Advanced tab.

Basic Tab (UI)

  • AI provider selection (ai.provider)
  • Per-provider fields: API key, model, base URL (where applicable)
  • Server port (server.port)
  • CSV output directory (evidence.csv_output_dir)

Advanced Tab (UI)

  • analysis.ai_max_tokens
  • analysis.shortened_prompt_cutoff_tokens
  • analysis.connection_test_max_tokens
  • analysis.date_buffer_days
  • analysis.citation_spot_check_limit
  • analysis.artifact_deduplication_enabled
  • analysis.max_merge_rounds
  • evidence.large_file_threshold_mb (displayed in GB as "Evidence Size Threshold")
  • evidence.intake_timeout_seconds
  • ai.local.request_timeout_seconds
  • ai.<provider>.attach_csv_as_file (per-provider toggles for Claude, OpenAI, Kimi, Local)

Config-Only (not in UI)

  • server.host — must be edited in config.yaml
  • analysis.artifact_ai_columns_config_path — must be edited in config.yaml

Artifact AI Columns Configuration

File: config/artifact_ai_columns.yaml

This file controls which CSV columns are sent to the AI for each artifact type. By default, all columns from a parsed artifact CSV are included. When an artifact has an entry in this file, only the listed columns are sent — reducing token usage and focusing the AI on forensically relevant data.

Structure

```yaml
artifact_ai_columns:
  <artifact_key>:
    - column_name_1
    - column_name_2
    - ...
```

Example

```yaml
artifact_ai_columns:
  runkeys:
    - ts
    - name
    - command
    - username
  shimcache:
    - last_modified
    - name
    - path
```

Currently Configured Artifacts

Windows: runkeys, shimcache, sam, services, shellbags, amcache, browser.downloads, browser.history, prefetch, mft, usnjrnl, powershell_history, bam, sru.application, tasks, userassist, muicache, recyclebin, evtx.

Linux: bash_history, zsh_history, fish_history, python_history, wtmp, btmp, lastlog, users, groups, sudoers, network.interfaces, syslog, journalctl, packagemanager, ssh.authorized_keys, ssh.known_hosts, cronjobs, services.

How to Edit

  • To include all columns for an artifact: remove its entry from the file.
  • To restrict columns: list only the column names you want the AI to see.
  • Column names must match the CSV headers exactly (as produced by the Dissect parser).
  • Changes take effect on the next analysis run — no restart required.
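Because a misspelled column name is silently excluded, it can help to sanity-check an entry against the parsed CSV before running an analysis. The script below is an illustrative helper, not part of AIFT:

```python
import csv
import io

def missing_columns(csv_text, wanted):
    """Return the configured column names that do NOT appear in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [c for c in wanted if c not in header]

# "cmd" is a deliberate typo for "command", so it is reported as missing.
sample = "ts,name,command,username\n2024-01-01,Run,calc.exe,alice\n"
print(missing_columns(sample, ["ts", "name", "cmd"]))  # → ['cmd']
```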
