DataFoundry is a data infrastructure platform built for the Department of Industrial Design at Eindhoven University of Technology. It contains tools for data collection and prototyping, aimed at design research and education. This repo contains the main web server component, which is built on top of the Play Framework.
With DataFoundry, we aim to ease data collection and processing, making it extremely easy to connect various data sources and combine all data into a single data platform that encourages new forms of design research, mashing up, making and hacking.
- Versatile Data Management: Supports numerous dataset types including IoT (timeseries), Entity (key-value), Media (images), Diaries, Forms, and more.
- User & Project Administration: Full support for managing users, projects, and roles (admins, librarians, moderators), which makes it easier to get approval in DPIA procedures.
- Authentication: Secure authentication with Single Sign-On (SSO) via OpenID Connect, Azure or SAML.
- Real-time Data Streaming: Seamless integration with the OOCSI real-time messaging ecosystem and our own API, which simplifies cross-device data collection.
- External Service Integration: Connects with services like Rawgraphs, Azure, localAI, OpenAI, Telegram, Fitbit, and Google Fit.
- Built-in Tooling:
  - Scripting: Run (sandboxed) server-side JavaScript to process, filter, and react to data.
  - AI Tools: Integrated support for local and remote AI models (LLMs, Text-to-Speech), including our own API wrapper to manage tokens.
  - Notebooks: In-browser Python/JavaScript notebooks (Starboard) for data analysis and documentation.
  - Transcription: On-premise audio/video transcription powered by Whisper.
  - ESP Tools: Web-based flashing and file management for ESP32 microcontrollers.
- Client Libraries: Connect prototypes and applications from various platforms, including Python, Processing, JavaScript, and Unity.
- FAIR and Easy Publishing Workflow: Data Foundry lets you upload projects directly to Zenodo through fairly, making it easier to share your datasets and projects.
- Language: Java, Scala
- Framework: Play Framework
- Build Tool: sbt (Simple Build Tool)
- Database: H2 (in-memory for dev), Ebean ORM, PostgreSQL (for production)
- Containerization: Docker, Docker Compose, Podman
- Frontend: HTMX, jQuery, SASS
- Documentation: Jekyll
The project follows a modular structure:
- Root Project: The main entry point, configuration, and aggregation.
- modules/common: Contains the core logic, routes, models, and dependencies (e.g., Pac4j for auth, Apache Jena for RDF, Lucene for search).
  - modules/common/app: Contains most of the platform elements, including the data controllers, views and all other modules.
  - modules/common/public: Contains all public assets, including imported code and frameworks/platforms (like ViperIDE, Starboard and Twine). These can be found in the vendor folder.
Routes are split, with the root conf/routes delegating to common.Routes located in modules/common/conf/. This modularity allows for clear separation of concerns.
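In a Play routes file, delegation to another router is written as a line starting with `->`, followed by a path prefix and the router name. The sketch below lists the routers a routes file delegates to. The function name `df_list_delegated_routers` is hypothetical and not part of the repository.

```shell
# Hypothetical helper, not part of the repository.
# Print the router named on every delegation line ("-> <prefix> <router>")
# of a Play routes file. $1: path to the routes file.
df_list_delegated_routers() {
  # awk splits on whitespace, so leading indentation before "->" is ignored.
  awk '$1 == "->" { print $3 }' "$1"
}
```

Running `df_list_delegated_routers conf/routes` from the repository root should, going by the description above, include `common.Routes` in its output.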
- Docker & Docker Compose (or Podman)
- Git
The easiest way to get DataFoundry up and running is with Docker. Internally we use Podman for development because of its additional security benefits, but Data Foundry should also be fully compatible with Docker.
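Because either engine may be installed, a script can pick one at run time. This is a minimal sketch: `df_engine` and the `DF_ENGINE` override variable are hypothetical names, and it assumes that Podman accepts the same build and compose arguments as the Docker commands in this guide, which has not been verified here.

```shell
# Hypothetical helper, not part of the repository.
# Pick a container engine: honour DF_ENGINE if set, else prefer podman, else docker.
df_engine() {
  if [ -n "${DF_ENGINE:-}" ]; then
    echo "$DF_ENGINE"
  elif command -v podman >/dev/null 2>&1; then
    echo podman
  elif command -v docker >/dev/null 2>&1; then
    echo docker
  else
    echo "neither podman nor docker found on PATH" >&2
    return 1
  fi
}
```

The result can then stand in for the engine name, for example `"$(df_engine)" build --tag datafoundrydocker:basecontainer -f .devcontainer/Dockerfile.base .`.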
1. Clone the Repository
# Clone this repository
git clone https://github.com/data-foundry-id/data-foundry.git
cd data-foundry
# Initialize and update submodules
git submodule init
git submodule update
2. Build the Base Image
docker build --tag datafoundrydocker:basecontainer -f .devcontainer/Dockerfile.base .
3. Run in Development Mode
This command builds the development image and starts the application using Docker Compose.
docker build --tag datafoundrydocker:development --target development . && docker compose -f DF-development.yaml up
The application will be available at http://localhost:9000. The environment exposes ports 9000 (App), 8001, and 9092.
4. Run in Production Mode
For a production deployment, use the following command:
docker build --tag datafoundrydocker:production --build-arg BUILD_MODE=stage --target production . && docker compose -f DF-production.yaml up
You can also run the application directly on your host machine.
- Prerequisites: Java 25+, sbt
- Configure: Edit conf/application.conf with your desired settings. For more information on configuration options, see the Configuration Documentation.
- Run:
sbt run
Once running, you can interact with DataFoundry through its web interface or programmatically via its API. The documentation/ directory contains extensive guides, tutorials, and examples for everything from connecting your first datalogger to building a complete AI-powered chatbot.
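Before scripting against the server, it can help to wait until it actually answers. The following is a sketch only: `df_base_url` and `df_wait_ready` are hypothetical helper names, and `DF_HOST` and `DF_PORT` are assumed environment variables rather than project settings. The default matches the documented http://localhost:9000.

```shell
# Hypothetical helpers, not part of the repository.

# Print the base URL, defaulting to the documented http://localhost:9000.
df_base_url() {
  echo "http://${DF_HOST:-localhost}:${DF_PORT:-9000}"
}

# Poll the base URL until it answers or the attempts run out.
# $1: number of attempts (default 30), $2: seconds between attempts (default 2).
df_wait_ready() {
  _attempts="${1:-30}"
  _delay="${2:-2}"
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    # --fail makes curl exit non-zero on HTTP status 400 and above.
    if curl --silent --fail --output /dev/null "$(df_base_url)"; then
      echo "DataFoundry answered at $(df_base_url)"
      return 0
    fi
    sleep "$_delay"
    _i=$((_i + 1))
  done
  echo "no answer from $(df_base_url) after $_attempts attempts" >&2
  return 1
}
```

A typical use would be `df_wait_ready 60 1 && echo "ready"` right after starting the containers or `sbt run` in another terminal.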
DataFoundry exposes a comprehensive REST API for programmatic access.
- Swagger UI: Live, interactive API documentation is available in a running instance under the /public/lib/swagger-ui/ path.
- Swagger Definition: The OpenAPI (Swagger) definition can be found in conf/swagger.yml.
- Reference Docs: Further details on the API can be found in the documentation/_Reference directory.
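For a quick overview of the available endpoints without starting the server, the paths can be read straight from the definition file. This sketch assumes the conventional OpenAPI YAML layout, in which path keys are unquoted, start with `/`, and are indented by exactly two spaces under the top-level `paths:` key. If conf/swagger.yml is formatted differently, the pattern needs adjusting. `df_list_api_paths` is a hypothetical name.

```shell
# Hypothetical helper, not part of the repository.
# List the endpoint paths declared in an OpenAPI/Swagger YAML file.
# Assumes path keys are unquoted, begin with "/", and are indented by two spaces.
# $1: path to the YAML file.
df_list_api_paths() {
  # Keep only lines of the form "  /some/path:" and print the path itself.
  sed -n 's|^  \(/[^:]*\):.*$|\1|p' "$1"
}
```

From the repository root this would be called as `df_list_api_paths conf/swagger.yml`.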
Contributions from the community are welcome! Whether it's reporting a bug, proposing a new feature, or submitting a pull request, we appreciate your help.
Please see our CONTRIBUTING.md file for detailed guidelines.
This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
In short, this means you are free to use, study, share, and modify the software. If you run a modified version of this software on a network server and let other users interact with it, you must also make your modified source code available to them.
For the full license text, see the LICENSE file.
