DataFoundry

DataFoundry is a data infrastructure platform built for the Department of Industrial Design at Eindhoven University of Technology. It contains tools for data collection and prototyping, aimed at design research and education. This repo contains the main web server component, which is built on top of the Play Framework.

With DataFoundry, we aim to ease data collection and processing, making it extremely easy to connect various data sources and combine all data into a single data platform that encourages new forms of design research, mashing up, making and hacking.

Key Features

Versatile Data Management: Supports numerous dataset types including IoT (timeseries), Entity (key-value), Media (images), Diaries, Forms, and more.
User & Project Administration: Full support for managing users, projects, and roles (admins, librarians, moderators), which makes approval in DPIA (Data Protection Impact Assessment) procedures easier.
Authentication: Secure authentication with Single Sign-On (SSO) via OpenID Connect, Azure or SAML.
Real-time Data Streaming: Seamless integration with the OOCSI real-time messaging ecosystem and our own API, which simplifies cross-device data collection.
External Service Integration: Connects with services like RAWGraphs, Azure, LocalAI, OpenAI, Telegram, Fitbit, and Google Fit.
Built-in Tooling:
- Scripting: Run (sandboxed) server-side JavaScript to process, filter, and react to data.
- AI Tools: Integrated support for local and remote AI models (LLMs, Text-to-Speech), including our own API wrapper to manage tokens.
- Notebooks: In-browser Python/JavaScript notebooks (Starboard) for data analysis and documentation.
- Transcription: On-premise audio/video transcription powered by Whisper.
- ESP Tools: Web-based flashing and file management for ESP32 microcontrollers.
Client Libraries: Connect prototypes and applications from various platforms, including Python, Processing, JavaScript, and Unity.
FAIR and Easy Publishing Workflow: DataFoundry lets you upload projects directly to Zenodo through fairly, making it easier to share your datasets and projects.

Tech Stack

Language: Java, Scala
Framework: Play Framework
Build Tool: sbt (Simple Build Tool)
Database: H2 (in-memory, for development), PostgreSQL (for production), both accessed through Ebean ORM
Containerization: Docker, Docker Compose, Podman
Frontend: HTMX, jQuery, SASS
Documentation: Jekyll

Architecture

The project follows a modular structure:

Root Project: The main entry point, configuration, and aggregation.
modules/common: Contains the core logic, routes, models, and dependencies (e.g., Pac4j for auth, Apache Jena for RDF, Lucene for search).
modules/common/app: Contains most of the platform elements, including the data controllers, views and all other modules.
modules/common/public: Contains all public assets, including imported code and frameworks/platforms (like ViperIDE, Starboard and Twine). These can be found in the vendor folder.

Routes are split, with the root conf/routes delegating to common.Routes located in modules/common/conf/. This modularity allows for clear separation of concerns.
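To see this delegation in a checkout, you can list the include directives in the root routes file. In Play's routes syntax, an include is a line that starts with `->`, followed by a path prefix and the router to delegate to. The sketch below assumes it is run from the repository root; `list_route_includes` is a hypothetical helper name and not part of DataFoundry.

```shell
# Hypothetical helper (not part of DataFoundry): print every include
# directive ("->" line) in a Play routes file, prefixed with its line number.
list_route_includes() {
  grep -n '^[[:space:]]*->' "$1" || echo "no include directives found in $1"
}

# Assumes the current directory is the repository root.
if [ -f conf/routes ]; then
  list_route_includes conf/routes
else
  echo "conf/routes not found; run this from the repository root"
fi
```

A line in the output that mentions `common.Routes` is the delegation described above.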

Getting Started

Prerequisites

Docker & Docker Compose (or Podman)
Git

Docker/Podman (Recommended)

The easiest way to get DataFoundry up and running is with Docker. Internally we use Podman for development because it offers additional security benefits (for example, it runs without a central daemon and supports rootless containers), but DataFoundry should also be fully compatible with Docker.

1. Clone the Repository

# Clone this repository
git clone https://github.com/data-foundry-id/data-foundry.git
cd data-foundry

# Initialize and update submodules
git submodule init
git submodule update

2. Build the Base Image

docker build --tag datafoundrydocker:basecontainer -f .devcontainer/Dockerfile.base .

3. Run in Development Mode

This command builds the development image and starts the application using Docker Compose.

docker build --tag datafoundrydocker:development --target development . && docker compose -f DF-development.yaml up

The application will be available at http://localhost:9000. The environment exposes ports 9000 (App), 8001, and 9092.
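A quick way to confirm that the development stack is answering is to ask curl for the HTTP status code only. This is a minimal sketch: `describe_status` is a hypothetical helper, and treating any 2xx or 3xx response from the root URL as "up" is an assumption, since the start page may redirect (for example to a login page).

```shell
# Hypothetical helper (not part of DataFoundry): turn an HTTP status code
# into a short verdict.
describe_status() {
  case "$1" in
    2??|3??) echo "up" ;;
    000|"")  echo "unreachable" ;;
    *)       echo "responding with HTTP $1" ;;
  esac
}

# Request only the status code; curl reports 000 when no response arrives.
code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 http://localhost:9000 || true)
echo "DataFoundry on port 9000: $(describe_status "$code")"
```

On first start the server may need some time to compile, so an "unreachable" verdict right after `docker compose up` does not necessarily indicate a failure.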

4. Run in Production Mode

For a production deployment, use the following command:

docker build --tag datafoundrydocker:production --build-arg BUILD_MODE=stage --target production . && docker compose -f DF-production.yaml up

Local Development (sbt)

You can also run the application directly on your host machine.

Prerequisites: Java 25+, sbt
Configure: Edit conf/application.conf with your desired settings. For more information on configuration options, see the Configuration Documentation.
Run:
```
sbt run
```

Usage & Examples

Once running, you can interact with DataFoundry through its web interface or programmatically via its API. The documentation/ directory contains extensive guides, tutorials, and examples for everything from connecting your first datalogger to building a complete AI-powered chatbot.

API Documentation

DataFoundry exposes a comprehensive REST API for programmatic access.

Swagger UI: Live, interactive API documentation is available in a running instance under the /public/lib/swagger-ui/ path.
Swagger Definition: The OpenAPI (Swagger) definition can be found in conf/swagger.yml.
Reference Docs: Further details on the API can be found in the documentation/_Reference directory.
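As a small illustration of where the Swagger UI lives, the sketch below builds its URL from an instance's base URL. `swagger_ui_url` is a hypothetical helper, and the base URL shown assumes the local development setup on port 9000; substitute the address of your own instance.

```shell
# Hypothetical helper (not part of DataFoundry): build the Swagger UI URL
# for a given base URL.
swagger_ui_url() {
  # Strip one trailing slash from the base URL, then append the documented path.
  printf '%s/public/lib/swagger-ui/\n' "${1%/}"
}

# Assumes a local development instance; open the printed URL in a browser.
swagger_ui_url "http://localhost:9000"
```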

Contributing

Contributions from the community are welcome! Whether it's reporting a bug, proposing a new feature, or submitting a pull request, we appreciate your help.

Please see our CONTRIBUTING.md file for detailed guidelines.

License

This project is licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

In short, this means you are free to use, study, share, and modify the software. If you run a modified version of this software on a network server and let other users interact with it, you must also make your modified source code available to them.

For the full license text, see the LICENSE file.
