28 changes: 28 additions & 0 deletions docs/_index.md
@@ -0,0 +1,28 @@
---
title: Texera Documentation
---

{{% pageinfo %}}
Welcome to the **Texera Documentation Portal**! This is your central hub for understanding, deploying, and contributing to the Texera platform.
{{% /pageinfo %}}

Texera is an open-source data analytics and workflow management system. Use the sections below to find what you're looking for.

### 📚 [Getting Started](/docs/getting-started/)
New to Texera? Start here to set up your environment, install dependencies, and explore deployment options (Docker, AWS, GCP, Kubernetes, or Single Node).

### 🎓 [Tutorials](/docs/tutorials/)
Learn by doing. Explore step-by-step guides on how to use the UI, create datasets, manage workflows, and operate advanced features like Python UDFs and LLM integrations.

### 🧠 [Concepts](/docs/concepts/)
Take a deep dive into the design behind Texera. Learn about Operators, Workflows, scalable execution, and how the core architecture works under the hood.

### 🛠️ [Contribution Guidelines](/docs/contribution-guidelines/)
Want to contribute to Texera? Find resources on setting up a local microservice development environment, writing Java or Python operators, navigating the contribution process, and understanding our code standards.

### 📖 [Reference & Examples](/docs/reference/)
Explore reference materials, past GUI screenshots, example workflows, and API specifications.

---

**Don't know where to begin?** Head over to the **[Overview](/docs/overview/)** to read the pitch on why you should use Texera, who it's built for, and how the architecture works at a high level.
38 changes: 38 additions & 0 deletions docs/concepts/_index.md
@@ -0,0 +1,38 @@
---
title: Concepts
description: >
Overview of the key ideas and components behind Texera. This section introduces core concepts that help users and contributors understand how Texera works.
weight: 30
---

{{% pageinfo %}}
This section explains the foundational concepts behind Texera — the ideas, architecture, and components that make up the platform.
{{% /pageinfo %}}

Understanding Texera conceptually helps both **users** and **contributors** get the most out of the system.

For end users, it provides background on how workflows and operators interact to process data.
For contributors, it offers insight into the design principles and architecture that power Texera’s engine and user interface.

---

### What’s in this section

The **Concepts** section introduces the core ideas that define Texera’s design and operation:

- **Workflows:** How users visually build and manage data pipelines.
- **Operators:** The modular units that perform data transformations.
- **Execution Engine:** The core component that executes workflows efficiently.
- **Data Model:** How Texera represents, stores, and streams data.
- **Architecture:** The high-level structure connecting frontend, backend, and execution layers.

Each page below explores one of these areas in more depth, explaining how Texera’s internal components work together to support flexible, scalable, and interactive data analytics.

---

### When to read this section

If you’re new to Texera, start with the **[Overview](/docs/overview/)** page to understand what the platform does.
Then come here to learn *how it works under the hood*.

If you’re contributing to Texera or integrating it with other systems, the detailed concept pages — such as **Engine**, **Operator Framework**, and **Architecture** — will help you understand Texera’s internal design and extension points.
160 changes: 160 additions & 0 deletions docs/contribution-guidelines/_index.md
@@ -0,0 +1,160 @@
---
title: "Contribution Guidelines"
description: "How to contribute to Texera code and documentation."
weight: 60
categories: [Texera, Contributing]
tags: [contributing, development, documentation, github, workflow]
---

{{% pageinfo %}}
Thank you for your interest in contributing to Texera! This guide explains how to contribute to both **Texera’s codebase** and **documentation**.
We follow a fork-based workflow and adopt the [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) standard for commit messages.
{{% /pageinfo %}}

# Contributing to Texera

Texera welcomes contributions from everyone — whether you’re fixing a small bug, improving documentation, or adding new features.

---

## 👥 Roles in the Project

| Role | Key Permissions | How to Join |
|------|-----------------|--------------|
| **Contributor** | Submit issues & PRs, join discussions | Start contributing — no formal process |
| **Committer** | Merge PRs, push code, vote on code changes | Nominated by the PPMC based on quality contributions |
| **PPMC Member** | Governance, release voting, new committer approvals | Voted in by existing PPMC members |
| **Mentor** | Guide project and ensure Apache compliance | Appointed by the Incubator PMC |

---

## 🛠 How to Contribute Code

### 1. Fork the Repository
Fork the [Texera repository](https://github.com/Texera/texera) on GitHub and clone it locally.

### 2. Find or Open an Issue
- Pick an existing issue or create a new one describing your proposal or bug.
- Discuss your approach with committers before coding to reach consensus.

### 3. Create and Submit a Pull Request
- Develop in a new branch of your fork.

> **Modifying the SQL schema?**
> Be sure to update `sql/changelog.xml` by adding a new `<changeSet>` element.
- When ready, submit a PR to the main Texera repository.
- **Allow edits from maintainers** to let committers make small fixes if needed.

#### PR Title and Commit Format
We use [Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/):
- Example PR titles:
  - `feat: add new join operator`
  - `fix(ui): resolve workflow panel crash`
  - `chore(deps): bump dependency versions`
- The PR title becomes the final squashed commit message upon merge.
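
A title can be checked locally before opening a PR. The sketch below is a minimal, unofficial check, assuming a POSIX shell with `grep -E`. The function name `is_conventional_title` and the list of accepted types are illustrative assumptions drawn from commonly used Conventional Commits types, not a list that Texera's CI is known to enforce.

```bash
# Hypothetical helper: succeeds (exit 0) when the title looks like
# "type(optional-scope): description". The type list is an assumption.
is_conventional_title() {
  pattern='^(feat|fix|chore|docs|refactor|test|perf|style|build|ci|revert)(\([A-Za-z0-9._/-]+\))?!?: .+'
  printf '%s\n' "$1" | grep -Eq "$pattern"
}

# Usage: report the result for a candidate title.
title="fix(ui): resolve workflow panel crash"
if is_conventional_title "$title"; then
  echo "ok: $title"
else
  echo "not a Conventional Commits title: $title"
fi
```

Because the PR title becomes the squashed commit message, a check like this can catch a malformed title before review starts.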

#### PR Description Should Include:
- **Purpose:** what the PR addresses; use `Closes #1234` to auto-close the related issue.
- **Summary:** short overview of your changes.
- Optional: **design document**, **technical diagram**, or **screenshots**.

Avoid including:
- Local config files (e.g., `python_udf.conf`)
- Secrets or credentials
- Binary or build artifacts
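
One way to catch such files before committing is to scan the staged paths. The following is a rough sketch, assuming a POSIX shell. The function name `flag_unwanted_paths` and every pattern other than `python_udf.conf` are illustrative assumptions, not a list defined by the Texera project.

```bash
# Hypothetical helper: reads file paths on stdin and prints those that
# look like local config, credentials, or build artifacts.
# Only python_udf.conf comes from the guideline above; the other
# patterns are assumed examples.
flag_unwanted_paths() {
  grep -E '(^|/)(python_udf\.conf|\.env)$|\.(pem|key|class|jar)$|(^|/)target/' || true
}

# Usage: check what is currently staged (run inside a git repository).
# git diff --cached --name-only | flag_unwanted_paths
```

Any path the helper prints is worth a second look before it goes into the PR; an empty result means none of the assumed patterns matched.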

---

## 🧪 Testing and Quality Checks

### Backend (Scala)
1. Run lint:
   ```bash
   sbt "scalafixAll --check"
   ```
   Fix with:
   ```bash
   sbt scalafixAll
   ```
2. Run formatter:
   ```bash
   sbt scalafmtCheckAll
   ```
   Fix with:
   ```bash
   sbt scalafmtAll
   ```
3. Execute tests:
   ```bash
   cd core
   sbt test
   ```

> For IntelliJ users: ensure the working directory matches the module (`amber` for engine tests, `core` for services).

### Frontend (Angular)
1. Run unit tests:
   ```bash
   cd core/gui
   ng test --watch=false
   ```
2. Format code:
   ```bash
   yarn format:fix
   ```

Write `.spec.ts` tests for new functionality to guard against future regressions.

---

## 🔍 Pull Request Review Process
1. Ask a committer to review your PR.
2. Add labels (e.g., `fix`, `enhancement`, `docs`).
3. Wait for CI to pass ([GitHub Actions](https://github.com/Texera/texera/actions)).
4. Mark your PR as **draft** if it’s not ready.
5. Once approved, a committer will merge your PR.

---

## 📝 Apache License Header
All new files must include the Apache License header.
To automate this in IntelliJ:

1. Go to **Settings → Editor → Copyright → Copyright Profiles**.
2. Create a profile named **Apache** and add:
   ```
   Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements. See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership...
   ```
3. Set this as the default profile for the project.

---

## ✍️ Contributing to Documentation

Texera uses [Hugo](https://gohugo.io/) and the [Docsy](https://github.com/google/docsy) theme to build its website.
All documentation is stored in the [Texera GitHub repository](https://github.com/Texera/texera).

### Quick Steps
1. Click **Edit this page** at the top of any doc page to edit directly on GitHub.
2. Make your edits and open a Pull Request.
3. Netlify automatically deploys a preview of the site for review.
4. Wait for approval and merge.

### Preview Locally
Run the Hugo development server to preview the site locally:
```bash
hugo server
```
Visit `http://localhost:1313` to view the site as you edit.

---

## 📚 Resources
- [Texera GitHub Repository](https://github.com/Texera/texera)
- [Conventional Commits Spec](https://www.conventionalcommits.org/en/v1.0.0/)
- [Hugo Documentation](https://gohugo.io/documentation/)
- [Docsy Guide](https://www.docsy.dev/docs/)
- [GitHub Pull Request Docs](https://help.github.com/articles/about-pull-requests/)
32 changes: 32 additions & 0 deletions docs/contribution-guidelines/apache-license-header.md
@@ -0,0 +1,32 @@
---
title: "Apache License header"
weight: 70
---

Every file must include the Apache License as a header. This can be automated in IntelliJ by
adding a Copyright profile:

1. Go to "Settings" → "Editor" → "Copyright" → "Copyright Profiles".
2. Add a new profile and name it "Apache".
3. Add the following text as the license text:

```
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```
4. Go to "Editor" → "Copyright" and choose the "Apache" profile as the default profile for this
project.
5. Click "Apply".
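
To spot files that are missing the header, a quick check can look for the opening phrase of the license text near the top of each file. This is a minimal sketch, assuming a POSIX shell. The function names `has_apache_header` and `report_missing_headers` and the 20-line search window are assumptions for illustration, not part of Texera's tooling.

```bash
# Hypothetical helper: succeeds when the first 20 lines of the file
# contain the opening phrase of the Apache License header.
has_apache_header() {
  head -n 20 "$1" | grep -F -q 'Licensed to the Apache Software Foundation (ASF)'
}

# Hypothetical helper: prints one line for each given file that lacks
# the header; prints nothing when every file has it.
report_missing_headers() {
  for f in "$@"; do
    has_apache_header "$f" || echo "missing header: $f"
  done
}
```

Calling `report_missing_headers` with the files changed in a branch gives a short list to fix before submitting.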