Parse and analyze Portuguese SAF-T tax files with AI assistants
13 tools · 152 tests
A Model Context Protocol (MCP) server that enables AI assistants like Claude, Cursor, and Windsurf to load, validate, and analyze Portuguese SAF-T (Standard Audit File for Tax Purposes) XML files. Once a file is loaded, you can query invoices, get revenue summaries and VAT breakdowns, and validate compliance with Portuguese tax rules.
SAF-T PT is a mandatory XML file that all Portuguese companies must be able to export from their accounting/billing software. It contains the company's invoices, payments, customers, products, tax entries, and more. This MCP server turns that XML into a queryable data source for AI assistants.
- Python 3.11+ and uv (recommended) or pip
- A SAF-T PT XML file exported from any Portuguese billing/accounting software (PHC, Sage, Primavera, etc.)
pip install saft-mcp

Or from source:

git clone https://github.com/bybloom-ai/saft-mcp.git
cd saft-mcp
uv sync

Claude Code

claude mcp add saft-mcp -- /path/to/saft-mcp/.venv/bin/python -m saft_mcp

Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "saft-mcp": {
      "command": "/path/to/saft-mcp/.venv/bin/python",
      "args": ["-m", "saft_mcp"]
    }
  }
}

Cursor / VS Code / Other MCP clients
Add to your MCP client configuration:
{
  "mcpServers": {
    "saft-mcp": {
      "command": "/path/to/saft-mcp/.venv/bin/python",
      "args": ["-m", "saft_mcp"]
    }
  }
}

Ask your AI assistant:
"Load my SAF-T file at ~/Documents/saft_2025.xml and give me a revenue summary"
The server will parse the file, extract all invoices and tax data, and make it available for querying through natural conversation.
saft_load

Load and parse a SAF-T PT XML file. This must be called before any other tool.
| Parameter | Type | Description |
|---|---|---|
| file_path | string | Path to the SAF-T XML file |
Returns company name, NIF, fiscal period, SAF-T version, and record counts (customers, products, invoices, payments).
Handles Windows-1252 and UTF-8 encodings, BOM stripping, and automatic namespace detection. Files under 50 MB are parsed with full DOM; larger files use streaming.
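As a rough illustration of the loading steps described above, the sketch below strips a UTF-8 BOM, reads the encoding from the XML declaration, and picks a parse strategy from the file size. The names `sniff_encoding` and `choose_parser` are hypothetical and not taken from the project, and treating a file of exactly 50 MB as "streaming" is an assumption.

```python
import codecs
import re

# Documented default for SAFT_MCP_STREAMING_THRESHOLD_BYTES (50 MB).
STREAMING_THRESHOLD_BYTES = 52_428_800

# Matches the encoding attribute of an XML declaration.
_DECLARED_ENCODING = re.compile(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']')


def sniff_encoding(head: bytes) -> tuple[str, bytes]:
    """Return (encoding name, head with any UTF-8 BOM removed)."""
    if head.startswith(codecs.BOM_UTF8):
        head = head[len(codecs.BOM_UTF8):]
    match = _DECLARED_ENCODING.search(head)
    # XML defaults to UTF-8 when no encoding is declared.
    encoding = match.group(1).decode("ascii").lower() if match else "utf-8"
    return encoding, head


def choose_parser(size_bytes: int, threshold: int = STREAMING_THRESHOLD_BYTES) -> str:
    """Pick "dom" below the threshold and "streaming" otherwise (boundary is assumed)."""
    return "dom" if size_bytes < threshold else "streaming"
```

In practice an XML parser such as lxml reads the declared encoding itself when given raw bytes, so a helper like this is mainly useful for reporting or pre-checks.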
saft_validate

Validate the loaded file against the official XSD schema and Portuguese business rules.
| Parameter | Type | Default | Description |
|---|---|---|---|
| rules | list[string] | all | Specific rules to check |
Available rules:
| Rule | What it checks |
|---|---|
| xsd | XML structure against SAF-T PT 1.04_01 XSD schema |
| numbering | Sequential invoice numbering within each series |
| nif | NIF (tax ID) mod-11 check digit validation |
| tax_codes | Tax percentages match known Portuguese VAT rates |
| atcud | ATCUD unique document codes are present and well-formed |
| hash_chain | Hash continuity across invoice sequences |
| control_totals | Calculated totals match declared control totals |
Returns results with severity (error/warning), location, and fix suggestions.
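The `nif` rule refers to the mod-11 check digit used for Portuguese tax IDs. A minimal sketch of that check follows; it is independent of the project's own validator, and the function name is hypothetical.

```python
def nif_check_digit_ok(nif: str) -> bool:
    """Return True if a 9-digit Portuguese NIF has a valid mod-11 check digit."""
    if len(nif) != 9 or not (nif.isascii() and nif.isdigit()):
        return False
    # Weights 9, 8, ..., 2 apply to the first eight digits.
    total = sum(int(digit) * weight for digit, weight in zip(nif[:8], range(9, 1, -1)))
    remainder = total % 11
    # Remainders 0 and 1 map to check digit 0; otherwise it is 11 - remainder.
    expected = 0 if remainder < 2 else 11 - remainder
    return int(nif[8]) == expected
```

This only covers the arithmetic. Whether the leading digits denote a valid taxpayer category is a separate check that the sketch does not attempt.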
saft_summary

Generate an executive summary of the loaded file. Takes no parameters.
Returns:
- Revenue totals (gross, credit notes, net)
- Invoice and credit note counts
- VAT breakdown by rate
- Top 10 customers by revenue
- Document type distribution (FT, FR, NC, ND, FS)
saft_query_invoices

Search and filter invoices with full pagination.
| Parameter | Type | Default | Description |
|---|---|---|---|
| date_from | string | - | Start date (YYYY-MM-DD) |
| date_to | string | - | End date (YYYY-MM-DD) |
| customer_nif | string | - | Filter by tax ID (partial match) |
| customer_name | string | - | Filter by name (case-insensitive, partial) |
| doc_type | string | - | FT, FR, NC, ND, or FS |
| min_amount | number | - | Minimum gross total |
| max_amount | number | - | Maximum gross total |
| status | string | - | N (normal), A (cancelled), F (invoiced) |
| limit | integer | 50 | Results per page (max 500) |
| offset | integer | 0 | Pagination offset |
Returns matching invoices with document number, date, type, customer, amounts, status, and line count.
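To make the filter and pagination semantics concrete, here is a small sketch covering three of the filters plus `limit`/`offset`. The `InvoiceRow` class, the function signature, and the returned dict shape are assumptions for illustration, not the server's actual models.

```python
from dataclasses import dataclass
from decimal import Decimal

MAX_QUERY_LIMIT = 500  # documented maximum page size


@dataclass(frozen=True)
class InvoiceRow:
    """Hypothetical, simplified stand-in for a parsed invoice."""
    invoice_no: str
    doc_type: str
    customer_name: str
    gross_total: Decimal


def query_invoices(rows, *, customer_name=None, doc_type=None,
                   min_amount=None, limit=50, offset=0):
    """Apply the optional filters, then return one page of matches."""
    matches = [
        row for row in rows
        if (customer_name is None or customer_name.lower() in row.customer_name.lower())
        and (doc_type is None or row.doc_type == doc_type)
        and (min_amount is None or row.gross_total >= min_amount)
    ]
    # Clamp the page size to the documented maximum.
    limit = max(1, min(limit, MAX_QUERY_LIMIT))
    offset = max(0, offset)
    return {"total": len(matches), "items": matches[offset:offset + limit]}
```

Returning the total match count alongside the page lets a client decide whether to request the next `offset`.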
saft_tax_summary

Generate a VAT analysis grouped by rate, month, or document type.
| Parameter | Type | Default | Description |
|---|---|---|---|
| date_from | string | - | Start date (YYYY-MM-DD) |
| date_to | string | - | End date (YYYY-MM-DD) |
| group_by | string | rate | Group by rate, month, or doc_type |
Returns taxable base, VAT amount, and gross total per group, plus overall totals.
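A sketch of grouping by rate with `Decimal` arithmetic is shown below. It assumes each invoice line is reduced to a `(tax_percentage, net_amount)` pair and derives VAT from the rate per group; a real file also carries declared tax amounts, which the server may use instead. The function name is hypothetical.

```python
from collections import defaultdict
from decimal import ROUND_HALF_UP, Decimal

CENT = Decimal("0.01")


def vat_by_rate(lines):
    """Sum net amounts per rate, then derive VAT and gross for each group."""
    bases = defaultdict(Decimal)  # Decimal() is zero
    for rate, net in lines:
        bases[rate] += net
    summary = {}
    for rate, base in sorted(bases.items()):
        # Assumption: VAT is computed per group and rounded half-up to the cent.
        vat = (base * rate / 100).quantize(CENT, rounding=ROUND_HALF_UP)
        summary[rate] = {"taxable_base": base, "vat": vat, "gross": base + vat}
    return summary
```

Keeping every amount as `Decimal` means sums such as 0.10 + 0.20 come out as exactly 0.30, which is not the case with binary floats.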
saft_query_customers

Search and filter customer master data with revenue enrichment.
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | string | - | Company name (case-insensitive, partial) |
| nif | string | - | Tax ID (partial match) |
| city | string | - | Billing city (case-insensitive, partial) |
| country | string | - | Country code (exact, e.g. "PT", "ES") |
| limit | integer | 50 | Results per page (max 500) |
| offset | integer | 0 | Pagination offset |
Returns customers with invoice count and total revenue per customer.
saft_query_products

Search and filter the product catalog with sales statistics.
| Parameter | Type | Default | Description |
|---|---|---|---|
| description | string | - | Product description (case-insensitive, partial) |
| code | string | - | Product code (partial match) |
| product_type | string | - | P (product), S (service), O (other), E (excise duties), I (taxes, fees and parafiscal charges other than VAT and stamp duty) |
| group | string | - | Product group (case-insensitive, partial) |
| limit | integer | 50 | Results per page (max 500) |
| offset | integer | 0 | Pagination offset |
Returns products with times sold, total quantity, and total revenue.
saft_get_invoice

Get full detail for a single invoice, including all line items.
| Parameter | Type | Description |
|---|---|---|
| invoice_no | string | Exact invoice number (e.g. "FR 2025A15/90") |
Returns complete invoice with header, document totals, special regimes, and all lines with product, quantity, price, tax, exemptions, and references.
saft_anomaly_detect

Detect suspicious patterns and irregularities in the loaded file.
| Parameter | Type | Default | Description |
|---|---|---|---|
| checks | list[string] | all | Specific checks to run |
Available checks:
| Check | What it detects |
|---|---|
| duplicate_invoices | Same customer + amount + date combinations |
| numbering_gaps | Missing sequential numbers within each series |
| weekend_invoices | Invoices issued on Saturdays or Sundays |
| unusual_amounts | Invoice amounts > 3 standard deviations from the mean |
| cancelled_ratio | High cancellation rates per series |
| zero_amount | Invoices with zero gross total |
Returns anomalies with type, severity, description, and affected documents.
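Two of these checks are easy to sketch. The code below assumes invoice numbers follow the "series/sequence" shape of the example "FR 2025A15/90" and uses the population standard deviation for the outlier test; both function names and those choices are assumptions, not the server's implementation.

```python
from collections import defaultdict
from statistics import mean, pstdev


def numbering_gaps(invoice_numbers):
    """Map each series to the sequence numbers missing between its min and max."""
    seen_by_series = defaultdict(set)
    for number in invoice_numbers:
        series, _, sequence = number.rpartition("/")
        seen_by_series[series].add(int(sequence))
    gaps = {}
    for series, seen in seen_by_series.items():
        missing = sorted(set(range(min(seen), max(seen) + 1)) - seen)
        if missing:
            gaps[series] = missing
    return gaps


def unusual_amounts(amounts, threshold=3.0):
    """Return amounts more than `threshold` standard deviations from the mean."""
    # Floats are adequate for an outlier heuristic; totals should stay Decimal.
    values = [float(amount) for amount in amounts]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [a for a, v in zip(amounts, values) if abs(v - mu) > threshold * sigma]
```

Note that with the population standard deviation a single outlier can only exceed 3 sigma when there are more than ten data points, so very small files will never trigger this check.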
saft_compare

Compare the loaded SAF-T file against a second file (e.g. month-over-month, year-over-year).
| Parameter | Type | Default | Description |
|---|---|---|---|
| file_path | string | - | Path to the second SAF-T XML file |
| metrics | list[string] | all | Metrics to compare |
Available metrics: revenue, customers, products, doc_types, vat.
Returns period labels and a changes dict with before/after/delta per metric. Includes top new/lost customers, top movers, and percentage changes.
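The before/after/delta shape can be pictured with the following sketch. The key names (`pct_change`, `new`, `lost`) and both helper names are assumptions for illustration; the actual response fields may differ.

```python
from decimal import Decimal


def compare_metric(before: Decimal, after: Decimal) -> dict:
    """Build a before/after/delta entry with a percentage change."""
    delta = after - before
    # A percentage change is undefined when the baseline is zero.
    pct = None if before == 0 else (delta / before * 100).quantize(Decimal("0.1"))
    return {"before": before, "after": after, "delta": delta, "pct_change": pct}


def customer_changes(before_nifs: set, after_nifs: set) -> dict:
    """List customers (by tax ID) that appear in only one of the two files."""
    return {
        "new": sorted(after_nifs - before_nifs),
        "lost": sorted(before_nifs - after_nifs),
    }
```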
saft_aging

Compute accounts receivable aging from invoices and payments.
| Parameter | Type | Default | Description |
|---|---|---|---|
| reference_date | string | today | Date to age from (YYYY-MM-DD) |
| buckets | list[int] | [30,60,90,120] | Aging bucket boundaries in days |
Returns per-customer aging with amounts in each bucket, sorted by total outstanding. Uses FIFO allocation of payments against invoices.
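FIFO allocation means payments settle the oldest invoices first, and only what remains open is aged. A sketch for a single customer follows; the function names, the bucket labels, and the choice to age from the invoice date are assumptions.

```python
from datetime import date
from decimal import Decimal


def bucket_labels(buckets):
    """Turn boundaries like [30, 60] into labels ["0-30", "31-60", "61+"]."""
    labels, lower = [], 0
    for upper in buckets:
        labels.append(f"{lower}-{upper}")
        lower = upper + 1
    labels.append(f"{lower}+")
    return labels


def age_receivables(invoices, payments_total, reference_date, buckets=(30, 60, 90, 120)):
    """Allocate payments to the oldest invoices first, then bucket what is left."""
    labels = bucket_labels(buckets)
    aged = {label: Decimal("0") for label in labels}
    unallocated = payments_total
    for issued, amount in sorted(invoices):  # oldest invoice first
        applied = min(amount, unallocated)
        unallocated -= applied
        outstanding = amount - applied
        if outstanding == 0:
            continue
        age_days = (reference_date - issued).days
        # The bucket index is the number of boundaries the age exceeds.
        index = sum(1 for boundary in buckets if age_days > boundary)
        aged[labels[index]] += outstanding
    return aged
```

For example, with invoices of 100 (10 Jan), 200 (1 Mar) and 50 (25 Mar), payments of 150, and a reference date of 15 Apr 2025, the first invoice is fully settled, 150 of the second stays open in the 31-60 bucket, and the third sits in 0-30.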
saft_export

Export data to CSV files for use in spreadsheets or other tools.
| Parameter | Type | Default | Description |
|---|---|---|---|
| export_type | string | - | invoices, customers, products, tax_summary, or anomalies |
| file_path | string | - | Output CSV file path |
| filters | dict | - | Optional filters (same as corresponding query tool) |
Returns file path, row count, and column names.
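The export step amounts to writing dict rows with a header and reporting what was written. The sketch below writes to any text handle so it can be tried without touching disk; `write_csv` and its return keys are hypothetical, and the real tool additionally takes a file path and filters.

```python
import csv
import io
from decimal import Decimal


def write_csv(rows, handle):
    """Write dict rows as CSV and report the row count and column names."""
    columns = list(rows[0].keys()) if rows else []
    writer = csv.DictWriter(handle, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return {"row_count": len(rows), "columns": columns}


# Usage: export two illustrative invoice rows to an in-memory buffer.
buffer = io.StringIO()
result = write_csv(
    [
        {"invoice_no": "FT 2025A/1", "gross_total": Decimal("123.00")},
        {"invoice_no": "FT 2025A/2", "gross_total": Decimal("45.10")},
    ],
    buffer,
)
```

When writing to a real file, open it with `newline=""` as the `csv` module documentation recommends, so that line endings are not translated twice.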
saft_stats

Generate a statistical overview of invoicing data.
| Parameter | Type | Default | Description |
|---|---|---|---|
| date_from | string | - | Start date (YYYY-MM-DD) |
| date_to | string | - | End date (YYYY-MM-DD) |
Returns invoice statistics (mean, median, std deviation), daily/weekly/monthly distributions, customer concentration (Pareto analysis), and top/bottom invoices.
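Customer concentration can be read as "what share of revenue comes from the top N customers". A sketch of that figure and of the basic invoice statistics, with hypothetical function names and output keys:

```python
from decimal import Decimal
from statistics import mean, median


def invoice_stats(amounts):
    """Basic descriptive statistics over invoice gross totals."""
    return {"count": len(amounts), "mean": mean(amounts), "median": median(amounts)}


def top_n_share(revenue_by_customer, n=5):
    """Percentage of total revenue contributed by the n largest customers."""
    total = sum(revenue_by_customer.values(), Decimal("0"))
    if total == 0:
        return Decimal("0")
    top = sum(sorted(revenue_by_customer.values(), reverse=True)[:n], Decimal("0"))
    return (top / total * 100).quantize(Decimal("0.1"))
```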
1. saft_load -> Parse the XML file
2. saft_validate -> Check compliance (XSD + business rules)
3. saft_summary -> Get the big picture (revenue, top customers, VAT)
4. saft_query_invoices -> Drill into specific invoices
5. saft_get_invoice -> Full detail for a single invoice
6. saft_tax_summary -> VAT analysis by rate, month, or doc type
7. saft_anomaly_detect -> Flag suspicious patterns
8. saft_stats -> Statistical distributions and trends
9. saft_compare -> Diff against another SAF-T file
10. saft_export -> Export results to CSV
Example questions you can ask after loading a file:
- "How much revenue did the company make this year?"
- "Show me all credit notes above 500 euros"
- "What's the monthly VAT breakdown?"
- "Are there any validation errors in this file?"
- "List invoices for customer XPTO in Q3"
- "What percentage of revenue comes from the top 5 customers?"
- "Are there any suspicious patterns or anomalies?"
- "Compare this file against last month's SAF-T"
- "What's the accounts receivable aging?"
- "Export all invoices to CSV"
All settings are configurable via environment variables with the SAFT_MCP_ prefix:
| Variable | Default | Description |
|---|---|---|
| SAFT_MCP_STREAMING_THRESHOLD_BYTES | 52428800 (50 MB) | Files above this use streaming parser |
| SAFT_MCP_MAX_FILE_SIZE_BYTES | 524288000 (500 MB) | Maximum file size accepted |
| SAFT_MCP_SESSION_TIMEOUT_SECONDS | 1800 (30 min) | Session expiry after inactivity |
| SAFT_MCP_MAX_CONCURRENT_SESSIONS | 5 | Maximum simultaneous loaded files |
| SAFT_MCP_DEFAULT_QUERY_LIMIT | 50 | Default results per page |
| SAFT_MCP_MAX_QUERY_LIMIT | 500 | Maximum results per page |
| SAFT_MCP_LOG_LEVEL | INFO | Logging level |
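The project reads these settings with pydantic-settings. As a dependency-free stand-in, the sketch below shows the same idea (documented defaults, overridden by prefixed environment variables) using only the standard library; the `Settings` class and `load_settings` function are illustrative and not the project's actual code.

```python
import os
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Settings:
    """Defaults copied from the table above."""
    streaming_threshold_bytes: int = 52_428_800
    max_file_size_bytes: int = 524_288_000
    session_timeout_seconds: int = 1800
    max_concurrent_sessions: int = 5
    default_query_limit: int = 50
    max_query_limit: int = 500
    log_level: str = "INFO"


def load_settings(environ=None, prefix="SAFT_MCP_"):
    """Override defaults with PREFIX + FIELD_NAME variables when present."""
    env = os.environ if environ is None else environ
    overrides = {}
    for field in fields(Settings):
        raw = env.get(prefix + field.name.upper())
        if raw is not None:
            # Coerce the string to the type of the documented default.
            overrides[field.name] = type(field.default)(raw)
    return Settings(**overrides)
```

For example, launching the server with `SAFT_MCP_MAX_QUERY_LIMIT=200` in the environment would correspond to `load_settings()` returning a `Settings` whose `max_query_limit` is 200.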
AI Assistant (Claude, Cursor, etc.)
        |
        |  MCP Protocol (stdio)
        v
+----------------------------------------------+
|               saft-mcp server                |
|                                              |
|  server.py             FastMCP entry point   |
|  state.py              Session management    |
|                                              |
|  parser/                                     |
|    detector.py         Namespace detection   |
|    encoding.py         Charset handling      |
|    full_parser.py      DOM parse (< 50 MB)   |
|    models.py           Pydantic data models  |
|                                              |
|  tools/                                      |
|    load.py             saft_load             |
|    validate.py         saft_validate         |
|    summary.py          saft_summary          |
|    query_invoices.py   saft_query_invoices   |
|    query_customers.py  saft_query_customers  |
|    query_products.py   saft_query_products   |
|    get_invoice.py      saft_get_invoice      |
|    tax_summary.py      saft_tax_summary      |
|    anomaly_detect.py   saft_anomaly_detect   |
|    compare.py          saft_compare          |
|    aging.py            saft_aging            |
|    export.py           saft_export           |
|    stats.py            saft_stats            |
|                                              |
|  validators/                                 |
|    xsd_validator.py    XSD 1.04_01           |
|    business_rules.py   Numbering, totals     |
|    nif.py              NIF mod-11            |
|    hash_chain.py       Hash continuity       |
|                                              |
|  schemas/                                    |
|    saftpt1.04_01.xsd   Official XSD          |
+----------------------------------------------+
Key design decisions:
- All monetary values use Decimal to avoid floating-point rounding in tax calculations
- lxml for XML parsing, with automatic XSD 1.1 feature stripping (the official Portuguese XSD uses xs:assert and xs:all with unbounded children, which lxml's XSD 1.0 engine cannot handle natively)
- Pydantic v2 models validated against real PHC Corporate exports
- Namespace auto-detection by scanning the first 4 KB of the file (never hardcoded)
- Windows-1252 encoding handled natively via the XML declaration
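Namespace auto-detection from the first 4 KB could look roughly like the sketch below. It assumes the root element is `AuditFile` and that the namespace is declared as the default `xmlns` on that element; prefixed roots are not handled, and the function name is hypothetical.

```python
import re

PROBE_BYTES = 4096  # "first 4 KB", per the design note above

# Default namespace declared on the AuditFile root element.
_ROOT_XMLNS = re.compile(rb'<AuditFile\b[^>]*?\sxmlns=["\']([^"\']+)["\']')


def detect_namespace(head: bytes):
    """Return the default namespace URI found in the probed bytes, or None."""
    match = _ROOT_XMLNS.search(head[:PROBE_BYTES])
    return match.group(1).decode("utf-8", "replace") if match else None
```

A caller would read the probe with something like `open(path, "rb").read(4096)` and pass the detected URI to the parser's namespace map, so files from different SAF-T versions load without a hardcoded namespace.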
# Install with dev dependencies
uv sync --extra dev
# Run tests (152 tests)
pytest
# Lint
ruff check src/ tests/
# Format
ruff format src/ tests/
# Type check
mypy src/saft_mcp/
src/saft_mcp/          # Source code
  server.py            # FastMCP entry point, tool registration
  config.py            # Settings (pydantic-settings, env vars)
  state.py             # Session store, parsed file state
  exceptions.py        # SaftError hierarchy
  parser/              # XML parsing (encoding, detection, models)
  tools/               # One file per MCP tool
  validators/          # XSD, business rules, NIF, hash chain
  schemas/             # Official XSD file
tests/                 # Mirrors src/ structure
pyproject.toml         # Project config (hatch build, ruff, mypy, pytest)
- saft_query_customers -- search and filter customer master data
- saft_query_products -- search and filter product catalog
- saft_get_invoice -- full invoice detail with line items
- saft_anomaly_detect -- flag duplicate invoices, numbering gaps, unusual amounts
- saft_compare -- diff two SAF-T files (e.g. month-over-month)
- saft_aging -- accounts receivable aging analysis
- saft_export -- export data to CSV
- saft_stats -- statistical overview and distributions
- Streaming parser for large files (>= 50 MB)
- Accounting SAF-T support (journal entries, general ledger, trial balance)
- saft_trial_balance -- generate trial balance from accounting data
- saft_ies_prepare -- pre-fill IES annual tax return fields
- saft_cross_check -- cross-reference invoicing vs accounting SAF-T
- PyPI package (pip install saft-mcp)
- GitHub Actions CI (pytest + ruff + mypy)
- SAF-T PT 1.04_01 (current Portuguese standard)
Tested with real exports from PHC Corporate. Should work with SAF-T files from any compliant Portuguese software (Sage, Primavera, PHC, Moloni, InvoiceXpress, etc.).
MIT
Built by bybloom.ai, a business unit of Bloomidea