elleskay/scamshield

ScamShield: check, report, stay safe. An unofficial rebuild of Singapore's ScamShield.

An unofficial portfolio reconstruction of Singapore's ScamShield. Paste a suspicious message, phone number, or email and get an instant verdict with the signals behind it; report a scam and watch a reviewer verify it. A real vertical slice, not a mockup: a React Native app, a NestJS API on AWS serverless, native call and SMS screening, and a spec-driven gate that refuses to pass unless every requirement is proven.

Not affiliated with the official ScamShield, Open Government Products, GovTech, or the Singapore Police Force. The name refers only to the product this replica is modeled on. Educational use only. Do not report real scams here.

Try it

App preview https://elleskay.github.io/scamshield/ (real React Native components via react-native-web, light theme, against the live API)
Admin dashboard https://elleskay.github.io/scamshield/admin/ (needs the admin token)
Live API health https://14cet1wgg0.execute-api.ap-southeast-1.amazonaws.com/health
Built on the mobile-platform template

The product this reconstructs

ScamShield is an Open Government Products app, launched in 2020, that lets Singapore citizens check suspicious calls, messages, links, and emails and report scams to the authorities, with on-device and AI-assisted detection, community reporting, and automatic blocking of numbers verified by the Singapore Police Force. Per OGP's public product report it has handled on the order of 453,000 community checks and reports in a single quarter, blocked over 14,000 new scam numbers, and detects new scam variants in about five days, working with the SPF, GovTech, HTX, and financial institutions.

This repo rebuilds that product's core end to end: the check-a-message, number, or email verdict with the signals behind it, community reporting with a human-in-the-loop verify step, scam alerts and live stats, native call and SMS screening fed by a synced blocklist, and an admin verification dashboard. The aim was to reproduce the real stack and engineering rigor (React Native, NestJS, AWS serverless, SQS, classification, push, IaC, a spec-driven gate), not the partner integrations that need real government systems.

The impact figures above describe the real OGP product. This reconstruction is an independent, unofficial build for learning and portfolio purposes.

Demo

Every screen is the real app running against the live AWS API (light theme by default, full dark variant), captured on an Android emulator, not a mockup. The headline screens (a check, an explained scam verdict with its signals and risk meter, and a verified caller) are shown in the banner above.

Verdicts and their signals, verified-sender and verified-caller labels, "reported N times" clustering, the do-not-tap link warning, report status, stats, and alerts all come from the deployed backend. A report stays Under review until an admin verifies it.

My reports, scam alerts, and a report submitted confirmation

The reviewer side closes the loop. An admin dashboard lists reports, verifies each as scam, suspicious, or clean (which notifies the reporter), offers search and CSV export, and uploads scam numbers to the blocklist the app syncs for native call screening.

Admin verification dashboard

A live check against the deployed API:

$ curl -s https://14cet1wgg0.execute-api.ap-southeast-1.amazonaws.com/health
{"status":"ok","service":"mobile-platform-api","time":"2026-..."}

$ curl -s -X POST .../reports/check \
    -H 'content-type: application/json' \
    -d '{"text":"URGENT: your bank account is locked, verify at http://dbs-secure.link"}'
{"verdict":"scam","score":0.92,
 "reason":"contains a link and urgency language",
 "signals":["link present","urgency lure"]}

The Android build runs end to end under Maestro on an emulator in CI, and ships as a signed release APK with the call-screening service compiled in (see docs/MOBILE.md).
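A client consuming the `/reports/check` response shown in the curl example above may want to validate the body before rendering a verdict card. The following is a minimal sketch, not the app's actual client: the `CheckResponse` and `isCheckResponse` names are hypothetical, and the field list is taken only from the sample response and the verdict values named in this README.

```typescript
// Verdict values as listed in this README.
type Verdict = "clean" | "suspicious" | "spam" | "scam";

// Shape inferred from the sample /reports/check response (an assumption).
interface CheckResponse {
  verdict: Verdict;
  score: number;
  reason: string;
  signals: string[];
}

const VERDICTS: readonly string[] = ["clean", "suspicious", "spam", "scam"];

// Narrow an unknown JSON value to CheckResponse without trusting the server.
function isCheckResponse(value: unknown): value is CheckResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.verdict === "string" &&
    VERDICTS.indexOf(v.verdict) !== -1 &&
    typeof v.score === "number" &&
    typeof v.reason === "string" &&
    Array.isArray(v.signals) &&
    v.signals.every((s) => typeof s === "string")
  );
}

// The sample body from the curl call, parsed as untrusted JSON.
const sample: unknown = JSON.parse(
  '{"verdict":"scam","score":0.92,"reason":"contains a link and urgency language","signals":["link present","urgency lure"]}'
);
const sampleIsValid = isCheckResponse(sample);
```

A guard like this keeps a malformed or partial response from reaching the UI as a misleading verdict.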

Why it exists

To show the stack and engineering practices of the real ScamShield (TypeScript and React, NestJS, PostgreSQL, AWS, IaC, CI/CD, SSDLC, SQS, OpenSearch, ML/LLM classification, push) end to end. The point is not feature count, it is rigor: every requirement is specified, tested at the layer that can actually prove it, and backed by a real run rather than a screenshot.

What it does

  • Check a message, phone number, or email and get a verdict (clean, suspicious, spam, or scam) with a score and the signals behind it.
  • Trusted senders. Messages from registered Sender IDs (CPF, IRAS, MOM, MAS, and others) read as channel-verified; genuine OTP messages are not flagged; a scam number embedded in a message body escalates the whole message.
  • Report a scam and get a queued receipt; it appears under My Reports as Under review until a human verifies it.
  • Verify (admin). A reviewer lists, searches, and CSV-exports reports, marks each scam, suspicious, or clean, and uploads scam numbers to the blocklist.
  • Notify. A push fires to the reporter when an admin confirms their report as a scam, never on the classifier's guess alone.
  • Native screening. A synced blocklist feeds an Android CallScreeningService and the iOS Call Directory and Message Filter extensions.
  • Alerts and stats. A scam-advisory feed and live usage counters, with "reported N times" clustering of similar reports.
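The check behaviour in the list above (a verdict with a score and signals, trusted senders, and escalation on an embedded scam number) can be sketched as a small deterministic heuristic. This is an illustration only, not the repository's classifier: the patterns, weights, thresholds, the `trusted sender` and `known scam number` signal names, and the choice to let a trusted Sender ID short-circuit to clean are all assumptions, and OTP handling and the spam verdict are left out for brevity.

```typescript
type Verdict = "clean" | "suspicious" | "scam";

interface CheckResult {
  verdict: Verdict;
  score: number;
  signals: string[];
}

// Sender IDs named in this README; the real registry is larger.
const TRUSTED_SENDERS = ["CPF", "IRAS", "MOM", "MAS"];

// Hypothetical patterns and weights, chosen only for illustration.
const LINK = /https?:\/\/\S+/i;
const URGENCY = /\b(urgent|immediately|locked|suspended|verify)\b/i;

// Keep only digits and plus signs so "+65 9123 4567" matches "+6591234567".
function normalizeNumber(raw: string): string {
  return raw.replace(/[^\d+]/g, "");
}

function classify(text: string, sender?: string, blocklist: string[] = []): CheckResult {
  // Assumption: a registered Sender ID short-circuits to a clean verdict.
  if (sender !== undefined && TRUSTED_SENDERS.indexOf(sender.toUpperCase()) !== -1) {
    return { verdict: "clean", score: 0, signals: ["trusted sender"] };
  }

  const signals: string[] = [];
  let score = 0;
  if (LINK.test(text)) { signals.push("link present"); score += 0.5; }
  if (URGENCY.test(text)) { signals.push("urgency lure"); score += 0.4; }

  // A blocklisted number anywhere in the body escalates the whole message.
  const body = normalizeNumber(text);
  const hit = blocklist.some((n) => {
    const needle = normalizeNumber(n);
    return needle.length > 0 && body.indexOf(needle) !== -1;
  });
  if (hit) { signals.push("known scam number"); score = 1; }

  const verdict: Verdict = score >= 0.8 ? "scam" : score >= 0.4 ? "suspicious" : "clean";
  return { verdict, score, signals };
}

// The message from the curl example trips both the link and urgency rules.
const result = classify("URGENT: your bank account is locked, verify at http://dbs-secure.link");
```

Returning the signals alongside the verdict is what lets the app explain a result instead of only colouring a card.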

Architecture

User flow

What a person actually does, and where each tap goes. A check is synchronous and returns a verdict immediately. Reporting is fire-and-forget: it returns a receipt right away and the heavy work happens off the request path.

flowchart TD
    A([Open app]) --> B{What to check}
    B -->|Message| C["POST /reports/check"]
    B -->|Email| C
    B -->|Phone number| D["POST /numbers/check"]
    C --> H[Classifier: verdict, score, signals]
    D --> H
    H --> I{Verdict}
    I -->|clean / verified| J[Green card with trusted-sender or verified-caller badge]
    I -->|suspicious / spam| K[Amber card with signals]
    I -->|scam| L[Red card with signals and do-not-tap warning]
    J --> M{Report it}
    K --> M
    L --> M
    M -->|No| Z([Done])
    M -->|Yes| N["POST /reports returns a queued receipt"]
    N --> O["Appears under My Reports as Under review"]
    O --> P[Admin verifies in the dashboard]
    P --> Q[Status updates and push on confirmed scam]

Check and report sequence

The full lifecycle of a report, including the split HTTP-plus-SQS topology and the human-in-the-loop verify step. The push fires on the admin verdict, not on the classifier suggestion.

sequenceDiagram
    actor U as User (app)
    participant GW as API Gateway
    participant H as HTTP Lambda (NestJS)
    participant Q as SQS reports queue
    participant W as Worker Lambda
    participant DB as Postgres (scamshield)
    participant CL as Classifier
    actor A as Admin (dashboard)
    participant EX as Expo Push

    U->>GW: POST /reports/check {text, sender}
    GW->>H: invoke
    H->>CL: classify(text, sender)
    CL-->>H: verdict, score, signals
    H-->>U: verdict card (synchronous)

    U->>GW: POST /reports {text, deviceToken}
    GW->>H: invoke
    H->>DB: addReport(status = pending)
    H->>Q: SendMessage {reportId, text}
    H-->>U: {reportId, status: queued}

    Q->>W: deliver batch (up to 10)
    W->>DB: markProcessedIfNew(reportId)
    Note over W,DB: idempotent dedup on reportId
    W->>CL: classify(text)
    CL-->>W: suggested verdict
    W->>DB: setSuggestion(reportId, verdict)

    A->>GW: GET /admin/reports (Bearer token)
    GW->>H: invoke
    H->>DB: listAll()
    H-->>A: reports with suggestions
    A->>GW: PATCH /admin/reports/:id {verdict: scam}
    GW->>H: invoke
    H->>DB: verify(reportId, scam), set reviewed_at
    H->>EX: notifyScam(deviceToken, reportId)
    EX-->>U: push "a scam you reported was confirmed"
    H-->>A: updated summary

Logical architecture

The pieces and how they relate, independent of where they run, drawn as a single tree so no lines cross. The app works offline against a local heuristic that mirrors the server classifier. The API hides its storage behind one ReportsStore boundary with two interchangeable backends. The LLM classifier and OpenSearch clustering are optional and degrade to a deterministic default when unconfigured.

flowchart TD
    ROOT[ScamShield system]

    ROOT --> APP[Mobile app: Expo / React Native]
    ROOT --> ADMIN[Admin web: React / Vite]
    ROOT --> API[NestJS API]

    APP --> A1[Check / Reports / Alerts tabs]
    APP --> A2[Offline heuristic classifier]
    APP --> A3[Blocklist sync]
    A3 --> A4["Native: Android CallScreeningService, iOS Call Directory and Message Filter"]

    ADMIN --> B1[List, verify, search, CSV export, blocklist upload]

    API --> C1["Reports: /reports/check, /reports"]
    API --> C2["Numbers: /numbers/check, /numbers/blocklist"]
    API --> C3["Alerts: /alerts"]
    API --> C4["Stats: /stats"]
    API --> C5["Admin: /admin/* behind AdminGuard"]
    API --> C6[Classifier service]
    API --> C7[Push service]
    API --> C8[ReportsStore boundary]

    C6 --> D1[Deterministic heuristic default]
    C6 --> D2[External LLM classifier, optional]
    C8 --> D3[(Postgres: scamshield schema)]
    C8 --> D4[(In-memory fallback)]
    C8 --> D5[(OpenSearch clustering, optional)]

Physical architecture (AWS)

Where it runs, drawn top-down so the request path flows straight through. Two Lambdas come off one codebase: an HTTP handler behind API Gateway, and an SQS-triggered worker. Data lives in Neon's managed Postgres over its pooled endpoint (the pool is capped per Lambda for serverless safety). OpenSearch is optional and only used for clustering. Memory, timeout, and retention values below are the ones set in the CDK construct.

flowchart TD
    APP[iOS / Android app] --> GW[API Gateway v2 HTTP API]
    WEB[Admin browser on GitHub Pages] --> GW

    GW --> HL["HTTP Lambda lambda.handler<br/>Node 20 ARM64, 512 MB, 29 s"]

    HL --> PG[("Neon Postgres pooled endpoint<br/>scamshield schema")]
    HL --> EXPO[Expo Push API]
    HL --> Q[["SQS reports queue, visibility 60 s"]]

    Q --> WL["Worker Lambda worker.handler<br/>Node 20 ARM64, 1024 MB, 60 s"]
    Q --> DLQ[["Dead-letter queue<br/>maxReceive 5, retain 14 d"]]

    WL --> PG
    WL --> OS[("OpenSearch 2.13 t3.small, optional")]

Deployment (CI/CD)

Every push runs the full CI matrix and the spec gate. Merges to main deploy the API over OIDC (no long-lived AWS keys) and finish with a live smoke test. The app ships through EAS, either a full store build or a JS-only OTA update.

flowchart TD
    DEV[Developer push or PR] --> GH[GitHub Actions]

    GH --> CI[ci.yml: every PR]
    GH --> SEC[security.yml]
    GH --> DEP[deploy-api.yml: on main]
    GH --> EAS[mobile-build.yml]

    CI --> CI1[typecheck, lint, actionlint, expo-doctor]
    CI --> CI2[NestJS build and cdk synth]
    CI --> CI3[jest-expo and vitest]
    CI --> CI4[Maestro Android and Playwright admin e2e]
    CI --> CI5[spec coverage gate]

    SEC --> SEC1[CodeQL]
    SEC --> SEC2[gitleaks]
    SEC --> SEC3[npm audit]

    DEP --> DEP1[OIDC assume role]
    DEP1 --> DEP2[db migrate]
    DEP2 --> DEP3["cdk deploy --all"]
    DEP3 --> DEP4[live smoke test]

    EAS --> EAS1[eas build all platforms to the stores]
    EAS --> EAS2[eas update OTA, JS and assets only]

Database design

Four tables in an isolated scamshield schema on the shared Neon database. The design is deliberately small: processed_reports is an idempotency ledger so a redelivered SQS message is a no-op, and counters keeps cheap aggregates so /stats does not scan the reports table on the hot path. Snippets are stored redacted; full message text is never persisted.

erDiagram
    reports {
        text report_id PK "uuid"
        text device_token "indexed, opaque per-device id"
        text channel "message | email | number"
        text snippet "redacted preview, NOT NULL"
        text suggested_verdict "classifier suggestion"
        text status "pending | scam | suspicious | spam | clean"
        text cluster_key "indexed, domain or normalized text"
        text created_at "ISO timestamp"
        text reviewed_at "set on admin verify"
    }
    processed_reports {
        text report_id PK "idempotency ledger"
    }
    counters {
        text name PK "checks, reports, confirmedScams"
        bigint value "default 0"
    }
    blocklist {
        text number PK "scam phone number"
    }
    reports ||..o| processed_reports : "deduped by report_id"

Indexes: idx_reports_device on device_token (My Reports lookups) and idx_reports_cluster on cluster_key (the "reported N times" count). All DDL is created idempotently at boot by the PostgresStore, so the schema self-provisions on first connect. When DATABASE_URL is unset (CI and local), the same ReportsStore interface is served from memory instead.
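To make the store boundary and the idempotency ledger concrete, here is a minimal in-memory sketch. The method names `addReport`, `markProcessedIfNew`, and `setSuggestion` come from the sequence diagram; the signatures, the `getReport` accessor, the `InMemoryReportsStore` class, and the `handleDelivery` helper are assumptions made for illustration. The sketch is synchronous for brevity, whereas a Postgres-backed store would be asynchronous and could implement the ledger with an insert that does nothing on conflict.

```typescript
interface Report {
  reportId: string;
  snippet: string;
  status: "pending" | "scam" | "suspicious" | "spam" | "clean";
  suggestedVerdict?: string;
}

// Hypothetical slice of the store boundary. Method names follow the sequence
// diagram; the signatures are assumptions.
interface ReportsStore {
  addReport(reportId: string, snippet: string): void;
  markProcessedIfNew(reportId: string): boolean;
  setSuggestion(reportId: string, verdict: string): void;
  getReport(reportId: string): Report | undefined;
}

class InMemoryReportsStore implements ReportsStore {
  private readonly reports = new Map<string, Report>();
  private readonly processed = new Set<string>(); // stands in for processed_reports

  addReport(reportId: string, snippet: string): void {
    this.reports.set(reportId, { reportId, snippet, status: "pending" });
  }

  // Returns true only for the first call with a given id.
  markProcessedIfNew(reportId: string): boolean {
    if (this.processed.has(reportId)) return false;
    this.processed.add(reportId);
    return true;
  }

  setSuggestion(reportId: string, verdict: string): void {
    const report = this.reports.get(reportId);
    if (report !== undefined) report.suggestedVerdict = verdict;
  }

  getReport(reportId: string): Report | undefined {
    return this.reports.get(reportId);
  }
}

// Worker step: a redelivered SQS message finds its id in the ledger and is skipped.
function handleDelivery(store: ReportsStore, reportId: string, classify: () => string): boolean {
  if (!store.markProcessedIfNew(reportId)) return false;
  store.setSuggestion(reportId, classify());
  return true;
}

const store = new InMemoryReportsStore();
store.addReport("r-1", "redacted preview");
let classifierRuns = 0;
const runClassifier = (): string => { classifierRuns += 1; return "scam"; };
const first = handleDelivery(store, "r-1", runClassifier);
const second = handleDelivery(store, "r-1", runClassifier); // simulated redelivery
```

Because the worker checks the ledger before doing any work, the classifier runs once per report even when SQS delivers the same message twice, and the report itself stays pending until a human reviews it.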

Spec-driven development

The build is driven by a YAML spec, not by vibes. The first artifact for any feature is a requirement in specs/scamshield.yml; code and its test ship in the same change; and a coverage gate refuses to pass unless every requirement has a passing proof at the layer that can actually demonstrate it.

specs/scamshield.yml holds 44 requirements across 21 domains (check, classify, report, call, email, admin, alerts, stats, clustering, native screening, push, security, and more). Each entry is a small contract:

- id: SCAM-CHECK-001
  title: User can check a suspicious message and see a verdict
  category: functional
  verify: e2e
  platforms: [ios, android]
  severity: high
  given: the user is on the Check screen
  when: they paste a message with a link and a lure word and tap Check
  then: a SCAM or SUSPICIOUS verdict is shown with a short reason
  tags: [check]

The verify level decides what counts as proof. A data requirement proven only by an e2e test is a category mismatch and fails the gate; things a JS test cannot honestly prove (OS-level call blocking, a no-secrets bundle scan) are satisfied by a signed, freshness-checked verification artifact instead of a checkmark nobody earned.

flowchart TD
    SPEC["specs/scamshield.yml<br/>44 requirements, 21 domains"] --> ID["each: id, category, severity,<br/>given / when / then, verify level"]
    ID --> MATCH{verify level}
    MATCH -->|unit / component| JE[jest-expo and RNTL]
    MATCH -->|integration| VI[vitest and supertest]
    MATCH -->|e2e| MA[Maestro Android, Playwright admin]
    MATCH -->|native / manual| ART["signed verification artifact<br/>checksum plus signed commit, TTL 90 d"]
    JE --> COV[".spec-coverage/results.jsonl"]
    VI --> COV
    MA --> COV
    COV --> GATE{spec-coverage gate}
    ART --> GATE
    GATE -->|100% covered, all green,<br/>category matches, artifacts fresh| PASS([exit 0: deploy allowed])
    GATE -->|any uncovered, red, or stale| FAIL([nonzero: blocked])

The gate's own output is the signal of record. A green run reads:

scamshield v1: 100.0% covered (44/44)
  report: spec-coverage.md

Any uncovered, failing, category-mismatched, or stale-artifact requirement is listed by ID and exits nonzero, which blocks the deploy. A test claims a requirement by naming itself with the ID (for example specTest("[SCAM-CHECK-001] ...")); an ESLint rule fails any such test whose body has zero expect() calls, so coverage cannot be gamed with an empty assertion. See docs/TESTING.md for the full protocol.
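The core of the coverage check can be sketched in a few lines. This is a simplified illustration, not the `@platform/spec-test` implementation: the `TestResult` record shape, the `runGate` function, and the ID pattern are assumptions, and the category-match and artifact-freshness checks described above are omitted.

```typescript
// Hypothetical result record; the real results.jsonl format is not shown in this README.
interface TestResult {
  title: string;
  passed: boolean;
}

interface GateReport {
  summary: string;
  uncovered: string[];
  failing: string[];
  exitCode: number;
}

// Matches a bracketed requirement ID such as [SCAM-CHECK-001] in a test title.
const ID_PATTERN = /\[([A-Z]+(?:-[A-Z]+)*-\d+)\]/;

function runGate(specName: string, requirementIds: string[], results: TestResult[]): GateReport {
  const passed = new Set<string>();
  const failed = new Set<string>();
  for (const r of results) {
    const m = ID_PATTERN.exec(r.title);
    if (m === null) continue; // tests without an ID claim nothing
    (r.passed ? passed : failed).add(m[1]);
  }

  // A requirement with any failing test counts as failing, not covered.
  const failing = requirementIds.filter((id) => failed.has(id));
  const uncovered = requirementIds.filter((id) => !passed.has(id) && !failed.has(id));
  const total = requirementIds.length;
  const covered = total - failing.length - uncovered.length;
  const pct = total === 0 ? 0 : (covered / total) * 100;

  return {
    summary: `${specName}: ${pct.toFixed(1)}% covered (${covered}/${total})`,
    uncovered,
    failing,
    exitCode: covered === total ? 0 : 1,
  };
}

const ids = ["SCAM-CHECK-001", "SCAM-DB-001"];
const green = runGate("scamshield v1", ids, [
  { title: "[SCAM-CHECK-001] shows a verdict", passed: true },
  { title: "[SCAM-DB-001] persists a report", passed: true },
]);
const red = runGate("scamshield v1", ids, [
  { title: "[SCAM-CHECK-001] shows a verdict", passed: true },
]);
```

Here `green.exitCode` is 0 and `red.exitCode` is 1 with `SCAM-DB-001` listed as uncovered, which is the property that lets the gate block a deploy.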

What is proven (not just written)

| Layer | Requirements | Proven by |
| --- | --- | --- |
| Unit (data) | message/number/email verdicts, trusted-sender and OTP and scam-number-in-message handling, verdict signals, native block decision, blocklist match | jest-expo |
| Component (ui) | check-button state, verified-caller and verified-sender badges, "why this result" signals, alerts list, stats strip, link warning | jest-expo and React Native Testing Library |
| Integration (API) | check (message/number/trusted-sender/number-in-message), blocklist and admin upload, reports listing, admin search and CSV export, alerts, stats and clustering, admin list/verify/auth, key-gated LLM and fallback, validation 400s, idempotent SQS consumer, push on admin review, durable Postgres store | vitest and supertest |
| E2E (journey) | check message/number/email, report then appears under Reports, admin verifies a report in the dashboard | Maestro (Android emulator) and Playwright (admin web), against a live API |
| Manual (security) | no secrets in the release bundle | signed verification artifact (real bundle scan) |

A few specifics worth calling out:

  • Human in the loop. A report is pending until an admin reviews it. The classifier's call is recorded as a suggestion, and the push to the reporter fires on the admin's verdict, mirroring the real "verify, then notify" flow.
  • Durable persistence. The report store, idempotent processing, clustering, stats, and the blocklist sit behind a ReportsStore boundary with two backends: Postgres (correct across the split Lambda and SQS topology) and an in-memory fallback (CI/local). The live API runs on Postgres with the app's tables isolated in a scamshield schema; the Postgres path is proven in CI against a service container (SCAM-DB-001).
  • Real native code. The Android CallScreeningService is wired by an Expo config plugin (expo prebuild injects it into the manifest and it compiles into the release APK), and its block decision is unit-tested. The iOS Call Directory and Message Filter extensions ship as code but need Apple signing and a device, so they are documented in docs/MOBILE.md, not claimed as proven.
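The human-in-the-loop rule in the first bullet above can be expressed as a pure state transition: the admin's verdict sets the status, and only a confirmed scam triggers a push. This is a sketch under assumptions, not the API's code: the `Report` shape, the `verify` signature, and the `VerifyOutcome` type are hypothetical, and actually sending the push is left to the caller.

```typescript
type AdminVerdict = "scam" | "suspicious" | "clean";
type Status = "pending" | AdminVerdict;

interface Report {
  reportId: string;
  deviceToken: string;
  status: Status;
  suggestedVerdict?: AdminVerdict; // the classifier's call, advisory only
  reviewedAt?: string;
}

interface VerifyOutcome {
  report: Report;
  notify: boolean;
}

// Apply an admin verdict. The classifier suggestion is deliberately ignored here:
// only the human decision changes status or triggers a notification.
function verify(report: Report, verdict: AdminVerdict, now: Date): VerifyOutcome {
  const updated: Report = { ...report, status: verdict, reviewedAt: now.toISOString() };
  return { report: updated, notify: verdict === "scam" };
}

const pending: Report = {
  reportId: "r-1",
  deviceToken: "device-abc",
  status: "pending",
  suggestedVerdict: "scam",
};
const cleared = verify(pending, "clean", new Date(0));
const confirmed = verify(pending, "scam", new Date(0));
```

In this example the classifier suggested scam, yet `cleared.notify` is false because the reviewer marked the report clean; `confirmed.notify` is true only because the reviewer agreed.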

Tech stack

| Area | Choice |
| --- | --- |
| App | Expo (React Native), Expo Router, TypeScript strict; light/dark theming from a single token set |
| Admin | React with Vite; Playwright for e2e |
| API | NestJS (TypeScript), class-validator at the boundary, serverless-express on AWS Lambda and API Gateway |
| Async | AWS SQS report-intake queue and an idempotent worker Lambda, with a dead-letter queue |
| Classifier | on-device heuristic plus a server classifier with an LLM hook and a deterministic fallback; knows trusted senders, genuine OTP messages, and scam numbers embedded in text, and returns the signals behind every verdict |
| Push | Expo push (APNs/FCM) when a report is confirmed a scam |
| Data | Postgres (node-postgres, Neon's pooled endpoint) when DATABASE_URL is set, in-memory fallback for CI/local; OpenSearch-ready for clustering |
| Native | Android CallScreeningService (Kotlin), iOS Call Directory and Message Filter extensions (Swift), wired by Expo config plugins |
| Infra | AWS CDK (Lambda, API Gateway HTTP API, SQS, optional OpenSearch), GitHub Actions, OIDC deploy |
| Spec gate | @platform/spec-test runner and coverage gate over a YAML spec |

Run it

npm install
npm run test:spec                                # jest-expo and vitest, then the coverage gate
npm run -w @app/scamshield start                 # Expo dev server
npm run -w @service/scamshield-api start:dev     # API on :3000

The app reads the API base URL from EXPO_PUBLIC_API_URL; with no DATABASE_URL the API serves the in-memory store, so the full stack runs locally with zero cloud setup.
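The two environment switches described above can be illustrated with one small function. This is a sketch only: in the repository the app and the API read their variables in separate packages, and the `resolveConfig` name, the `LocalConfig` shape, and the `http://localhost:3000` default are assumptions (the port matches the dev command above, but whether the app falls back to it is not stated).

```typescript
// Environment passed in explicitly so the function is easy to test.
type Env = Record<string, string | undefined>;

interface LocalConfig {
  apiBaseUrl: string;
  store: "postgres" | "memory";
}

function resolveConfig(env: Env): LocalConfig {
  // Assumed default: the API dev server on port 3000. Trailing slashes are trimmed
  // so paths like "/reports/check" can be appended directly.
  const apiBaseUrl = (env.EXPO_PUBLIC_API_URL ?? "http://localhost:3000").replace(/\/+$/, "");
  // An unset or empty DATABASE_URL selects the in-memory store.
  const store = env.DATABASE_URL ? "postgres" : "memory";
  return { apiBaseUrl, store };
}

const local = resolveConfig({});
const deployed = resolveConfig({
  EXPO_PUBLIC_API_URL: "https://api.example.test/",
  DATABASE_URL: "postgres://user@host/db",
});
```

With an empty environment the result points at the local API and the in-memory store, which is the zero-cloud-setup path.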

Testing

The spec gate is the single source of truth. npm run test:spec resets the shared coverage file, runs the app (jest-expo) and API (vitest) suites pointed at it, ingests any Maestro e2e report, then runs spec-coverage and exits nonzero unless all 44 requirements are satisfied at their declared verify level. The e2e and native layers run in CI (Android emulator, Playwright, signed-artifact check); locally those requirements show as uncovered, which is expected. See docs/TESTING.md and docs/adr/0001-testing-architecture.md.

Deploy

The API deploys from CI on merge to main over GitHub OIDC (no long-lived AWS keys), running a DB migration, cdk deploy --all, and a live smoke test. To deploy by hand (needs an AWS account; see docs/DEPLOY.md and docs/SETUP.md):

cd infra/cdk/_template && npm install && npx cdk deploy

The mobile app ships through EAS: a full store build for native changes, or a JS-only OTA update for everything else (docs/MOBILE.md).

Structure

apps/app/
  app/(tabs)/        Check / Reports / Alerts screens (Expo Router tabs)
  components/        Shared UI (header, verdict card, alert list, stats strip, warning)
  lib/               API client, classifiers, theme tokens, device token, blocklist sync
  native/            CallScreeningService (Kotlin), Call Directory and Message Filter (Swift)
  plugins/           Expo config plugins that wire the native code at prebuild
  .maestro/          e2e flows
apps/admin/          React (Vite) admin verification dashboard and Playwright e2e
services/api/        NestJS API (reports, numbers, alerts, stats, admin, SQS consumer, classifier, push)
infra/cdk/           CDK: NestjsApi construct (Lambda, API Gateway, SQS, optional OpenSearch)
packages/spec-test/  Spec-driven test runner and coverage gate
specs/scamshield.yml The requirement spec
verification/        Signed artifacts for native and manual requirements

Roadmap

Built in phases, each landed with its spec and tests:

  1. The spine. Check-and-report a message, API and SQS intake, push on scam, input validation, no secrets in the bundle.
  2. The ScamShield surface. Check Call (/numbers/check), My Reports with verification status, Scam Alerts, and native call/SMS interception fed by a synced blocklist.
  3. Closing the gaps. Admin verification dashboard, email channel with a spoofed-sender heuristic, stats and "reported N times" clustering, in-app link warning, and a key-gated LLM classifier.
  4. Durable persistence. Postgres behind the ReportsStore boundary (Neon in production), making stats, clustering, and status correct across the split Lambda and SQS topology.
  5. More of the real product. Trusted-sender whitelist (CPF, IRAS, MOM, MAS, and others), scam-number-within-a-message escalation, and admin search and CSV export.
  6. Quality and operations. OTP-aware classification, per-signal "why" on every verdict, and admin blocklist upload.

Deliberately out of scope (these need real partner or government systems, so they are not faked): the WhatsApp/Telegram ScamShield Bot, iMessage/RCS, HTX and law-enforcement data sharing, bank/telco and Singpass Anti-Fraud integrations, the Google LLM-collaboration program, and real account auth (the admin uses a shared demo token; app checks are anonymous via an opaque device token).

License

MIT, see LICENSE.

Disclaimer

Unofficial, educational portfolio project. Not the official ScamShield. Do not use it to report real scams; use the official ScamShield channels.
