Blau is a research platform for water-purification scientists. A genetic algorithm proposes nano-filter molecular lattices; first-principles quantum chemistry — running on real IBM Quantum hardware — scores them. A researcher goes from a Barcelona water sample to an exportable, lab-ready filter blueprint in minutes.
Glossary note for contributors. Blau uses genetic optimisation and quantum chemistry — not statistical learning. There are no neural networks, no training data, no learned models in this repository.
- The Mediterranean carries 1.25 million microplastic fragments per km² — 4× the density of the North Pacific garbage patch.
- The Llobregat river — the city's main drinking-water source — carries pharmaceutical residues, pesticides, and heavy metals from upstream industry.
- In 2008 Barcelona nearly ran out of water entirely.
The people fixing this are water-purification researchers — materials scientists who design nano-filters one molecular structure at a time. Their design loop is hand-written input files, hours of waiting, and weeks per iteration. Blau closes that loop.
For a researcher, the workflow is one screen:
Load a water sample → mark which pollutants matter → press Generate → get a 3D molecular structure of a candidate filter, per-pollutant removal-efficiency predictions, and an export to standard chemistry formats for the next stage of validation.
| Stage | What happens |
|---|---|
| 1. Measure | Manual entry, CSV import, USB lab equipment, GEMStat dataset, or a live stream from the optional Qualcomm-powered field station. |
| 2. Propose | A DEAP-driven genetic algorithm searches across 5 design genes (pore size, layer thickness, material type, functionalisation, doping). |
| 3. Score | First-principles quantum chemistry computes binding energy at the electron level — Hartree-Fock baseline (PySCF) and VQE on real IBM Quantum hardware (PennyLane + qiskit-ibm-runtime). |
| 4. Blueprint | A real atomic structure (honeycomb graphene with Kekulé bond orders, edge-attached functional groups, multi-layer stacking) — exportable as CSV, XYZ, or SDF. |
Every layer of every result is tagged with the level of theory that produced it (HF, VQE, empirical, proxied). For a research tool, provenance is the most important UX decision in the app.
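One way to keep provenance attached to an exported blueprint is to write the level-of-theory tag into the comment line of the XYZ file (line 2 of the format is free text). The sketch below is a minimal illustration, not Blau's exporter: the `Atom` class, the `to_xyz` helper, and the `level_of_theory=` comment convention are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical atom record: element symbol plus Cartesian coordinates in Å."""
    element: str
    x: float
    y: float
    z: float


def to_xyz(atoms, level_of_theory):
    """Serialise atoms as XYZ text.

    XYZ layout: line 1 is the atom count, line 2 is a free-text comment,
    then one "element x y z" line per atom. The provenance tag is stored
    in the comment line (an assumed convention, not a standard).
    """
    lines = [str(len(atoms)), f"level_of_theory={level_of_theory}"]
    for atom in atoms:
        lines.append(f"{atom.element} {atom.x:.4f} {atom.y:.4f} {atom.z:.4f}")
    return "\n".join(lines) + "\n"


# Two carbons one C-C bond apart, tagged as a Hartree-Fock result.
atoms = [Atom("C", 0.0, 0.0, 0.0), Atom("C", 1.42, 0.0, 0.0)]
xyz_text = to_xyz(atoms, "HF")
```

Any viewer that reads XYZ ignores the comment line, so the tag travels with the file without breaking compatibility.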
core/services/genetic_optimizer.py evolves filter designs across 5 genes:
| Gene | Range |
|---|---|
| Pore size | 0.3 – 2.0 nm |
| Layer thickness | 0.5 – 5.0 nm |
| Material type | graphene · CNT · graphene oxide · composite · MOF-like |
| Functionalisation density | 0 – 1 |
| Doping level | 0 – 1 (pyridinic-N replacement) |
Tournament selection · two-point crossover · Gaussian mutation · fitness = |binding_energy|. A ProcessPoolExecutor parallelises candidate evaluation.
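The three operators named above can be sketched with the standard library alone. This is an illustration of the technique, not the contents of genetic_optimizer.py: the real optimiser uses DEAP (which ships equivalents as `tools.selTournament`, `tools.cxTwoPoint`, and `tools.mutGaussian`), and `toy_binding_energy`, the integer encoding of the material gene, and all hyperparameters here are assumptions.

```python
import random

# Gene bounds from the table above; material type is encoded as an index 0..4 (assumption).
BOUNDS = [(0.3, 2.0), (0.5, 5.0), (0, 4), (0.0, 1.0), (0.0, 1.0)]
MATERIAL_GENE = 2


def random_individual(rng):
    ind = [rng.uniform(lo, hi) for lo, hi in BOUNDS]
    ind[MATERIAL_GENE] = rng.randint(0, 4)
    return ind


def tournament(pop, fits, k, rng):
    """Pick k random contenders and return a copy of the fittest."""
    contenders = rng.sample(range(len(pop)), k)
    return list(pop[max(contenders, key=lambda i: fits[i])])


def two_point_crossover(a, b, rng):
    """Swap the gene slice between two random cut points, in place."""
    i, j = sorted(rng.sample(range(1, len(a)), 2))
    a[i:j], b[i:j] = b[i:j], a[i:j]


def gaussian_mutation(ind, sigma, indpb, rng):
    """Perturb each gene with probability indpb, clamped to its bounds."""
    for g, (lo, hi) in enumerate(BOUNDS):
        if rng.random() < indpb:
            if g == MATERIAL_GENE:
                ind[g] = rng.randint(lo, hi)  # categorical gene: resample
            else:
                step = rng.gauss(0.0, sigma * (hi - lo))
                ind[g] = min(hi, max(lo, ind[g] + step))


def toy_binding_energy(ind):
    """Placeholder score standing in for the quantum-chemistry engine."""
    return -(1.0 / (1.0 + abs(ind[0] - 0.8))) - 0.5 * ind[3]


def evolve(generations=20, pop_size=30, seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(pop_size)]
    for _ in range(generations):
        fits = [abs(toy_binding_energy(ind)) for ind in pop]  # fitness = |binding_energy|
        nxt = []
        while len(nxt) < pop_size:
            a = tournament(pop, fits, 3, rng)
            b = tournament(pop, fits, 3, rng)
            two_point_crossover(a, b, rng)
            gaussian_mutation(a, 0.1, 0.2, rng)
            gaussian_mutation(b, 0.1, 0.2, rng)
            nxt += [a, b]
        pop = nxt[:pop_size]
    return max(pop, key=lambda ind: abs(toy_binding_energy(ind)))


best = evolve()
```

In the real pipeline the fitness call is the expensive step, which is why candidate evaluation is the part handed to the process pool.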
For every candidate filter, Blau computes the binding energy at the electron level:
E_bind = E(filter + pollutant) − E(filter) − E(pollutant)
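As a worked instance of the expression above, the snippet below plugs in three made-up total energies in Hartree. The numbers are illustrative only and were not produced by any calculation; `binding_energy` is a hypothetical helper name.

```python
def binding_energy(e_complex, e_filter, e_pollutant):
    """E_bind = E(filter + pollutant) - E(filter) - E(pollutant)."""
    return e_complex - e_filter - e_pollutant


# Illustrative values only (Hartree); a negative result means the bound
# complex is lower in energy than the two separated fragments.
e_bind = binding_energy(e_complex=-151.30, e_filter=-75.60, e_pollutant=-75.65)

# The optimiser ranks candidates by the magnitude of the binding energy.
fitness = abs(e_bind)
```

With these inputs the binding energy comes out at roughly -0.05 Hartree, so the fitness is about 0.05.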
core/services/quantum_engine.py routes each molecule through one of three levels of theory, picked automatically by compute_binding_energy():
- Hartree-Fock (PySCF, STO-3G basis) — deterministic, millisecond baseline.
- VQE (Variational Quantum Eigensolver) with a UCCSD ansatz, executed via PennyLane's `qiskit.remote` device on top of qiskit-ibm-runtime, on real IBM Quantum hardware when `IBM_QUANTUM_TOKEN` is set. 1024 shots per measurement.
- Empirical fallbacks: van der Waals + DFT-D3-calibrated ionic models for systems STO-3G can't represent.
Heavy-element substitution is a deliberate scientific choice, not a hidden caveat. STO-3G has no basis for Z > 36, so lead, mercury, and cadmium would crash the engine. Blau substitutes them with same-group lighter atoms (Pb → Ge, Hg → Zn, Cd → Zn) and surfaces the substitution in the UI so a reviewer always knows what was actually computed.
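The substitution rule can be expressed as a lookup that also records what was swapped, so the caller can surface it. This is a minimal sketch under assumptions: `SUBSTITUTIONS` copies the three mappings listed above, while `substitute_heavy_atoms` and the note format are hypothetical names for this example rather than the engine's API.

```python
# The three proxies named in the text; each stays within its periodic-table group.
SUBSTITUTIONS = {"Pb": "Ge", "Hg": "Zn", "Cd": "Zn"}


def substitute_heavy_atoms(symbols):
    """Return (computed_symbols, notes).

    computed_symbols is what the basis set can actually handle;
    notes lists every swap so the UI can display it next to the result.
    """
    computed, notes = [], []
    for symbol in symbols:
        proxy = SUBSTITUTIONS.get(symbol)
        if proxy is None:
            computed.append(symbol)
        else:
            computed.append(proxy)
            notes.append(f"{symbol} -> {proxy}")
    return computed, notes


computed, notes = substitute_heavy_atoms(["C", "O", "Pb"])
```

Returning the notes alongside the symbols, rather than logging them, keeps the provenance in the same object as the result it qualifies.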
Genes alone aren't a filter — they have to become a real atomic structure. core/routers/filters.py does that work:
- Honeycomb graphene generated from the lattice constant a = 2.46 Å (C-C bond = a/√3).
- Kekulé bond-order assignment via BFS 2-colouring of the bipartite honeycomb graph, so every aromatic carbon ends up with the chemically correct alternating-double-bond pattern.
- Edge-atom detection plus chemistry-aware functional-group attachment: -COOH for heavy-metal chelation, -NH₂ for anionic pollutants, -OH as the default hydrogen-bonding site.
- Multi-layer stacking with the correct van der Waals interlayer spacing (3.4 Å).
- Distance-threshold bond detection over the final atom set.
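Three of the steps above (lattice generation, distance-threshold bond detection, and the BFS 2-colouring) can be sketched for a single flat sheet as follows. This is an illustration under assumptions, not the code in filters.py: the function names, the 1.2 × bond-length cutoff, and the parallelogram-shaped patch are choices made for this example, and the step from colouring to actual double-bond placement is left out because the source does not specify it.

```python
import math
from collections import deque

A = 2.46                  # graphene lattice constant in Å
BOND = A / math.sqrt(3)   # C-C bond length, about 1.42 Å


def honeycomb(nx, ny):
    """Return 2D positions for an nx-by-ny patch of two-atom unit cells."""
    a1 = (A, 0.0)
    a2 = (A / 2, A * math.sqrt(3) / 2)
    # Sublattice A at the cell origin, sublattice B at (a1 + a2) / 3.
    basis = [(0.0, 0.0), ((a1[0] + a2[0]) / 3, (a1[1] + a2[1]) / 3)]
    atoms = []
    for i in range(nx):
        for j in range(ny):
            ox = i * a1[0] + j * a2[0]
            oy = i * a1[1] + j * a2[1]
            for bx, by in basis:
                atoms.append((ox + bx, oy + by))
    return atoms


def detect_bonds(atoms, cutoff=1.2 * BOND):
    """Bond every pair closer than the cutoff (O(n^2), fine for small patches)."""
    bonds = []
    for p in range(len(atoms)):
        for q in range(p + 1, len(atoms)):
            if math.dist(atoms[p], atoms[q]) < cutoff:
                bonds.append((p, q))
    return bonds


def two_colour(n_atoms, bonds):
    """BFS 2-colouring; raises if the bond graph is not bipartite."""
    adj = [[] for _ in range(n_atoms)]
    for p, q in bonds:
        adj[p].append(q)
        adj[q].append(p)
    colour = [None] * n_atoms
    for start in range(n_atoms):
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if colour[v] is None:
                    colour[v] = 1 - colour[u]
                    queue.append(v)
                elif colour[v] == colour[u]:
                    raise ValueError("bond graph is not bipartite")
    return colour


atoms = honeycomb(3, 3)
bonds = detect_bonds(atoms)
colours = two_colour(len(atoms), bonds)
```

Atoms with fewer than three bonds in `bonds` are the edge atoms, which is where functional groups would attach.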
A curated 450+-entry pollutant map (core/services/pollutant_map.py) translates measured water parameters (heavy metals, OCPs, OPs, triazines, PAHs, VOCs, PFAS, phenols, pharmaceuticals, microplastics) into the dominant atomic-site representation that quantum chemistry can actually compute on.
Multi-tier, fully Dockerised. The Electron client never talks to the simulation engine directly — only the Celery worker does, so heavy compute and quantum jobs run out-of-band from any user-facing request.
```mermaid
flowchart LR
  subgraph host [Host machine]
    Electron[Electron_client]
  end
  subgraph compose [Docker_Compose_network]
    Web[Django_web]
    Worker[Celery_worker]
    DB[(PostgreSQL)]
    Redis[(Redis)]
    Core[FastAPI_core]
    SQLite[(SQLite_volume)]
  end
  subgraph quantum [IBM Cloud]
    IBMQ[(IBM Quantum hardware)]
  end
  Electron -->|HTTP_JSON| Web
  Web --> DB
  Web -->|enqueue_tasks| Redis
  Worker --> DB
  Worker --> Redis
  Worker -->|HTTP_CORE_SERVICE_URL| Core
  Core --> SQLite
  Core -.->|qiskit-ibm-runtime| IBMQ
```
- Electron client → Django `web`: JWT-authenticated JSON over HTTP (`http://localhost:8000` from the host). The desktop app does not call the core service; only the backend worker does.
- Django `web` → PostgreSQL & Redis: reads/writes application data; enqueues Celery jobs (e.g. filter generation).
- Celery `worker` → PostgreSQL, Redis, and core: executes tasks and calls the simulation service at `CORE_SERVICE_URL` (default `http://core:8000` on the internal network). See server/backend/filters/services/runner.py (`POST /filters/generate` and status polling).
- Core → IBM Quantum: when `IBM_QUANTUM_TOKEN` is set, VQE jobs are dispatched to a real IBM Quantum backend via `qiskit-ibm-runtime`.
- Core local state: SQLite on the `core-data` volume. For debugging, core is exposed on the host at port 8001.
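The worker's submit-then-poll pattern can be illustrated with a small helper that takes the status call as a function, so it runs without a live core service. This is a sketch, not the logic in runner.py: the `state` field, its `running` / `done` / `failed` values, and the default interval and timeout are all assumptions for this example.

```python
import time


def poll_until_done(fetch_status, interval_s=2.0, timeout_s=1500.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it reports a terminal state or time runs out.

    fetch_status is any zero-argument callable returning a dict; in a real
    worker it would wrap an HTTP GET against the core service.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status.get("state") in ("done", "failed"):  # hypothetical states
            return status
        if clock() >= deadline:
            raise TimeoutError("core job did not finish before the timeout")
        sleep(interval_s)


# Demo with canned responses instead of a network call.
responses = iter([
    {"state": "running"},
    {"state": "running"},
    {"state": "done", "job_id": "demo"},
])
result = poll_until_done(lambda: next(responses), sleep=lambda seconds: None)
```

Keeping the timeout below the Celery soft time limit lets the task fail with a clear error instead of being killed mid-poll.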
For longitudinal campaigns on a real water source, an Arduino UNO Q (Qualcomm QRB2210 SoC) running a Python app inside the Uno Q App Lab reads a Modulino Thermo (HS3003) over I²C and POSTs measurements over WiFi to the Blau API. Measurements appear live in the researcher's active study. The platform doesn't require it; researchers running real fieldwork do. Source: firmware/blau_uno_q_app/.
| Component | Technologies |
|---|---|
| Core (simulation) | FastAPI, PySCF (Hartree-Fock), PennyLane + qiskit-ibm-runtime (VQE on IBM Quantum), DEAP (genetic algorithms), SQLite |
| Backend | Django, Django REST Framework, Celery, Redis, PostgreSQL, JWT |
| Desktop client | Electron, React 19, TypeScript, Vite, Tailwind CSS, Leaflet (maps), Recharts, 3Dmol.js (molecular viewer), serialport (USB devices) |
| Landing page | Vite + React, plain CSS |
| Field station | Arduino UNO Q (Qualcomm QRB2210), Modulino Thermo HS3003, Python, WiFi HTTP |
| Infrastructure | Docker Compose, PostgreSQL 16, Redis 7 |
| Path | Role |
|---|---|
| core/ | FastAPI simulation engine: /health, /filters/*. Runs quantum chemistry (PySCF + PennyLane VQE on IBM Quantum) and genetic algorithm optimisation (DEAP). Uses SQLite on a Docker volume (DB_PATH=/data/h2osim.db) and a process pool for heavy work. |
| server/backend/ | Django REST API, Celery tasks, and app code. In Compose, ./server/backend is mounted at /home/app/backend in the web and worker containers. |
| server/ | Docker image build, requirements.txt, and env.example for backend configuration. |
| client/ | Electron + Vite + React. API calls use VITE_API_BASE_URL; request paths include /api/... (see client/src/renderer/src/utils/api/config.ts). |
| landing/ | Optional Vite + React marketing/portfolio site; not wired into docker-compose.yml. Run locally with npm install and npm run dev. |
| firmware/ | Optional field station — Arduino UNO Q (Qualcomm QRB2210) Python app. |
| docs/ | Product documentation, DevPost submission copy, hardware setup. |
- Full stack: Docker and Docker Compose v2.
- Desktop client: Node.js (LTS) and npm, for client/.
- Landing page (optional): Node.js and npm in landing/.
Python dependencies are installed inside the Docker images from server/requirements.txt and core/requirements.txt. For local Python work outside Docker, use those files with a virtual environment.
From the repository root:
1. Copy the environment template and fill in secrets:

   ```bash
   cp server/env.example server/.env
   ```

   Set at minimum a strong `SECRET_KEY`, `DB_PASSWORD` / `POSTGRES_PASSWORD` (same value), and Google OAuth values if you use Google login. See Environment variables below.

2. Start all services:

   ```bash
   docker compose up --build
   ```

3. Ports (default docker-compose.yml):

   | Service | Host port | Notes |
   |---|---|---|
   | Django API (`web`) | 8000 | REST API and `/api/health/` |
   | Core (`core`) | 8001 | `/health` and `/filters/*` |
   | PostgreSQL (`db`) | 5434 | Optional host access |
   | Redis | (none) | Reachable only inside the Compose network |

4. Smoke checks:
   - API: `GET http://localhost:8000/api/health/`
   - Core: `GET http://localhost:8001/health`
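The two smoke checks can also be run from Python with only the standard library. The sketch below is not one of the repository's check_services scripts; `is_healthy` and `run_smoke_checks` are hypothetical names, and treating HTTP 200 as the only healthy response is an assumption.

```python
from urllib.error import URLError
from urllib.request import urlopen

# The two health endpoints listed above, on their default host ports.
SMOKE_URLS = {
    "api": "http://localhost:8000/api/health/",
    "core": "http://localhost:8001/health",
}


def is_healthy(url, opener=urlopen, timeout=5):
    """Return True if the URL answers with HTTP 200, False on any failure."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (URLError, OSError):
        # Covers refused connections, timeouts, and non-2xx responses.
        return False


def run_smoke_checks(opener=urlopen):
    """Check every endpoint and return a {name: bool} report."""
    return {name: is_healthy(url, opener) for name, url in SMOKE_URLS.items()}
```

Calling `run_smoke_checks()` with the stack running should report both services as healthy; the `opener` parameter exists only so the helper can be exercised without a network.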
Optional: a separate benchmark-oriented compose file is docker-compose.benchmark.yml (different host ports for core/DB, and core tuned for benchmarking).
After starting the stack, you can run server/check_services.ps1 or server/check_services.sh if you want scripted checks against local URLs.
```bash
cd client
npm install
npm run dev
```

Create client/.env (or use .env.local) with the Django API origin:

```
VITE_API_BASE_URL=http://localhost:8000
```

The client builds request paths like /api/auth/login/; client/src/renderer/src/utils/api/config.ts accepts either a bare origin (`http://localhost:8000`) or a base URL ending in /api. Do not commit environment files that point to private or production APIs unless you intend to.
Django serves the React SPA from server/backend/static/client/dist/ (e.g. index.html plus assets/). Build the renderer and copy it there:
```bash
cd client
npm install
npm run build
# electron-vite writes the web bundle under out/renderer/
mkdir -p ../server/backend/static/client/dist
cp -r out/renderer/* ../server/backend/static/client/dist/
```

On the server (paths match /home/app/backend in Docker), the same copy targets /home/app/backend/static/client/dist/. Public installers can live under /var/www/downloads/; Django exposes `GET /downloads/client-latest.AppImage` and `GET /downloads/blau-setup.exe` (override paths with `LINUX_APPIMAGE_PATH` / `WINDOWS_SETUP_PATH` in .env).
- Water measurement management — manual entry, CSV import, USB/serial lab equipment integration, GEMStat dataset browser
- Interactive map — Leaflet-based station visualisation with marker clustering
- Filter generation — multi-pollutant selection, quantum-hardware toggle, real-time progress polling
- 3D molecular viewer — 3Dmol.js visualisation of generated filter structures
- Charts and analytics — Recharts-based pollutant concentration and filter performance charts
```bash
cd landing
npm install
npm run dev
```

The canonical template is server/env.example. Docker Compose loads server/.env for the db, web, and worker services.
| Variable | Purpose |
|---|---|
| `SECRET_KEY` | Django signing; must be long and random when `DEBUG=False`. |
| `DB_NAME`, `DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT` | Django database connection (`DB_HOST=db` and `DB_PORT=5432` inside Compose). |
| `POSTGRES_DB`, `POSTGRES_USER`, `POSTGRES_PASSWORD` | Must match the database name, user, and password expected by Django (used by the PostgreSQL container). |
| Variable | Purpose |
|---|---|
| `DEBUG`, `ALLOWED_HOSTS`, `CORS_ALLOW_ALL_ORIGINS` | Dev vs production behaviour and CORS. |
| `CELERY_BROKER_URL`, `CELERY_RESULT_BACKEND` | Redis URLs (`redis://redis:6379/0` in Compose). |
| `CELERY_TASK_TIME_LIMIT`, `CELERY_TASK_SOFT_TIME_LIMIT` | Task timeouts (soft limit should stay below hard limit). |
| `CORE_SERVICE_URL` | Base URL for the core service (`http://core:8000` inside Docker). |
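The consistency rules stated in the tables above (matching Django and PostgreSQL credentials, soft Celery limit below the hard limit) are easy to get wrong by hand, so a pre-flight check can help. The sketch below is not part of the repository; `parse_env` and `check_env` are hypothetical helpers, and the parser handles only simple `KEY=value` lines without quoting.

```python
def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip()
    return env


def check_env(env):
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    for django_key, pg_key in (("DB_NAME", "POSTGRES_DB"),
                               ("DB_USER", "POSTGRES_USER"),
                               ("DB_PASSWORD", "POSTGRES_PASSWORD")):
        if env.get(django_key) != env.get(pg_key):
            problems.append(f"{django_key} and {pg_key} differ")
    soft = int(env.get("CELERY_TASK_SOFT_TIME_LIMIT", "0"))
    hard = int(env.get("CELERY_TASK_TIME_LIMIT", "0"))
    if soft >= hard:
        problems.append("CELERY_TASK_SOFT_TIME_LIMIT must be below CELERY_TASK_TIME_LIMIT")
    return problems


sample = """
DB_NAME=thegreatfilter
DB_USER=postgres
DB_PASSWORD=local-dev-postgres-password
POSTGRES_DB=thegreatfilter
POSTGRES_USER=postgres
POSTGRES_PASSWORD=local-dev-postgres-password
CELERY_TASK_TIME_LIMIT=1800
CELERY_TASK_SOFT_TIME_LIMIT=1500
"""
problems = check_env(parse_env(sample))
```

To check a real file, pass the contents of server/.env to `parse_env` instead of the inline sample.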
| Variable | Purpose |
|---|---|
| `EMAIL_HOST` | SMTP server (e.g. smtp.gmail.com). |
| `EMAIL_PORT` | SMTP port. |
| `EMAIL_USE_TLS` | Enable TLS (True/False). |
| `EMAIL_HOST_USER` | SMTP login email. |
| `EMAIL_HOST_PASSWORD` | SMTP app password. |
| `DEFAULT_FROM_EMAIL` | Sender address for outgoing emails. |
These are set in docker-compose.yml under the core service and can be overridden with host environment variables or a .env file:
| Variable | Default | Purpose |
|---|---|---|
| `DB_PATH` | `/data/h2osim.db` | SQLite database path for simulation state. |
| `CORE_WORKERS` | `2` | Number of process pool workers for parallel simulation. |
| `USE_VQE` | `1` | Enable VQE quantum refinement (`1`) or use Hartree-Fock only (`0`). |
| `VQE_ACTIVE_ELECTRONS` | `4` | Number of active electrons in the VQE active space. |
| `VQE_ACTIVE_ORBITALS` | `4` | Number of active orbitals in the VQE active space. |
| `VQE_MAX_ITERATIONS` | `80` | Maximum VQE optimiser iterations. |
| `IBM_QUANTUM_TOKEN` | (empty) | IBM Quantum API token. When set, VQE runs on real quantum hardware instead of the simulator. |
| `IBM_QUANTUM_BACKEND` | `ibm_sherbrooke` | IBM Quantum backend device name. |
| `IBM_QUANTUM_SHOTS` | `1024` | Number of measurement shots per VQE circuit on hardware. |
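For reference, the defaults in the table can be read into a typed settings object as sketched below. This is an illustration, not the core service's configuration code: `CoreSettings`, `load_core_settings`, and the `runs_on_hardware` property are hypothetical names, and treating `USE_VQE` as enabled only when it equals `1` is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreSettings:
    db_path: str
    workers: int
    use_vqe: bool
    active_electrons: int
    active_orbitals: int
    max_iterations: int
    ibm_token: str
    ibm_backend: str
    ibm_shots: int

    @property
    def runs_on_hardware(self):
        """Hardware is used only when VQE is enabled and a token is present."""
        return self.use_vqe and bool(self.ibm_token)


def load_core_settings(environ=None):
    """Build settings from a mapping (defaults to os.environ), using the table's defaults."""
    env = os.environ if environ is None else environ
    return CoreSettings(
        db_path=env.get("DB_PATH", "/data/h2osim.db"),
        workers=int(env.get("CORE_WORKERS", "2")),
        use_vqe=env.get("USE_VQE", "1") == "1",
        active_electrons=int(env.get("VQE_ACTIVE_ELECTRONS", "4")),
        active_orbitals=int(env.get("VQE_ACTIVE_ORBITALS", "4")),
        max_iterations=int(env.get("VQE_MAX_ITERATIONS", "80")),
        ibm_token=env.get("IBM_QUANTUM_TOKEN", ""),
        ibm_backend=env.get("IBM_QUANTUM_BACKEND", "ibm_sherbrooke"),
        ibm_shots=int(env.get("IBM_QUANTUM_SHOTS", "1024")),
    )


# With no overrides, every field falls back to the documented default.
defaults = load_core_settings({})
```

Passing a plain dict instead of `os.environ` makes the defaults easy to inspect or override in a test.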
| Variable | Purpose |
|---|---|
| `GOOGLE_CLIENT_ID`, `GOOGLE_CLIENT_SECRET` | Google OAuth. |
| `LINUX_APPIMAGE_PATH` | Path to a desktop artifact served by Django (see server/backend/backend/settings.py). |
| `GEMSTAT_DATASET_DIR` | Reserved in env.example; ingestion uses the path you pass to `sync_gemstat_measurements` (see below). |
```
SECRET_KEY=dev-only-change-me-use-secrets-token-urlsafe-in-real-deploys
DEBUG=True
ALLOWED_HOSTS=localhost,127.0.0.1
CORS_ALLOW_ALL_ORIGINS=True
DB_NAME=thegreatfilter
DB_USER=postgres
DB_PASSWORD=local-dev-postgres-password
DB_HOST=db
DB_PORT=5432
POSTGRES_DB=thegreatfilter
POSTGRES_USER=postgres
POSTGRES_PASSWORD=local-dev-postgres-password
EMAIL_HOST=smtp.gmail.com
EMAIL_PORT=587
EMAIL_USE_TLS=True
EMAIL_HOST_USER=your-email@gmail.com
EMAIL_HOST_PASSWORD=your-app-password
DEFAULT_FROM_EMAIL=your-email@gmail.com
CELERY_BROKER_URL=redis://redis:6379/0
CELERY_RESULT_BACKEND=redis://redis:6379/0
CELERY_TASK_TIME_LIMIT=1800
CELERY_TASK_SOFT_TIME_LIMIT=1500
CORE_SERVICE_URL=http://core:8000
GOOGLE_CLIENT_ID=your-google-client-id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your-google-client-secret
```

For client/.env:

```
VITE_API_BASE_URL=http://localhost:8000
```

Public measurement data can be loaded from the UNEP GEMS/Water Global Freshwater Quality Archive (GEMStat export). Use version v3 on Zenodo:
- Dataset page: https://zenodo.org/records/18459694
- Direct zip: GFQA_v3.zip
- DOI: 10.5281/zenodo.18459694
The archive is licensed CC BY 4.0; credit the dataset and authors as required on the record page.
- Create a folder `server/backend/dataset/` on your machine (path relative to the repo root).
- Extract or copy CSVs so all needed files sit in that folder (flat directory). The importer loads station and parameter catalogs from:
  - `GEMStat_station_metadata.csv`
  - `GEMStat_parameter_metadata.csv`

  Files whose names start with `GEMStat_` are not used as parameter timeseries inputs; other `*.csv` files in the same directory are candidates for `--files`.
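Before starting a long import, it can be worth checking the dataset folder against the naming rule above. The sketch below is a stand-alone helper, not the importer itself: `timeseries_candidates` and `missing_metadata` are hypothetical names, and the logic simply restates the `GEMStat_` prefix rule described in the text.

```python
from pathlib import Path

# The two catalog files the importer expects in the dataset directory.
METADATA_FILES = (
    "GEMStat_station_metadata.csv",
    "GEMStat_parameter_metadata.csv",
)


def timeseries_candidates(dataset_dir):
    """List CSV names that could be passed to --files (no GEMStat_ prefix)."""
    return sorted(
        path.name
        for path in Path(dataset_dir).glob("*.csv")
        if not path.name.startswith("GEMStat_")
    )


def missing_metadata(dataset_dir):
    """Return the catalog files that are absent from the directory."""
    directory = Path(dataset_dir)
    return [name for name in METADATA_FILES if not (directory / name).is_file()]
```

For example, `missing_metadata("server/backend/dataset")` returning an empty list means both catalogs are in place, and `timeseries_candidates` shows which names are valid `--files` arguments.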
Inside Docker, the same directory appears as /home/app/backend/dataset because ./server/backend is bind-mounted to /home/app/backend.
With the stack running, from the repository root, prefer Compose's service name (avoids guessing container names like thegreatfilter-web-1):
```bash
docker compose exec web python manage.py sync_gemstat_measurements /home/app/backend/dataset \
  --files Temperature.csv pH.csv Water.csv Electrical_Conductance.csv Chloride.csv Sodium.csv \
  Calcium.csv Magnesium.csv Potassium.csv Sulfur.csv \
  --max-snapshots 50000
```

Equivalent with an explicit container name:

```bash
docker exec -it thegreatfilter-web-1 python manage.py sync_gemstat_measurements /home/app/backend/dataset \
  --files Temperature.csv pH.csv Water.csv Electrical_Conductance.csv Chloride.csv Sodium.csv \
  Calcium.csv Magnesium.csv Potassium.csv Sulfur.csv \
  --max-snapshots 50000
```

Imports can take a long time and use substantial disk and database space. The `--files` list limits which parameter CSVs are read; the two `GEMStat_*` metadata CSVs above must still be present in the dataset directory.
- Built at HackUPC 2026 (Barcelona).
- Sponsor track: Qualcomm — Arduino UNO Q field station with the QRB2210 SoC.
- Quantum execution: IBM Quantum via `qiskit-ibm-runtime`.
- Public-data attribution: UNEP GEMS/Water Global Freshwater Quality Archive (GEMStat v3, Zenodo 10.5281/zenodo.18459694), CC BY 4.0.
- Full DevPost submission copy: `docs/devpost-submission.md`.







