6 changes: 5 additions & 1 deletion .env.dist
@@ -1,5 +1,9 @@
-GITHUB_USER=github: put here your github user
+GITHUB_USER=github_username
 MY_OPERATOR_IMAGE=ghcr.io/${GITHUB_USER}/openserverless-operator
+# Optional: enable GHCR secure build+deploy shortcut in `task spark-standard`
+# Provide a PAT with scopes: read:packages, write:packages (delete:packages optional)
+GHCR_USER=github_username
+GHCR_TOKEN=ghcr_pat_with_package_scopes

 #enterprise monitoring
 SLACK_API_URL=your-slack-webhook-url
3 changes: 3 additions & 0 deletions .env.sample
@@ -36,6 +36,9 @@ REAL_HOME=your-actual-directory-outside-of-the-container
 LINODE_CLI_TOKEN=linode-token-from-dashboard
 # if you want to push your custom image to push in your repo
 MY_OPERATOR_IMAGE=user/nuvolaris-operator
+# GitHub Container Registry credentials for Spark operator builds
+GHCR_USER=your-github-username
+GHCR_TOKEN=your-github-token
 # a slack incoming webhook to write messages
 SLACK_WEBOOK=your-slack-webhook
 # ip or hostname and ssh user with sudo of a server running ubuntu 22
14 changes: 8 additions & 6 deletions Dockerfile
@@ -52,10 +52,9 @@ ADD --chown=nuvolaris:nuvolaris deploy/postgres-operator-deploy /home/nuvolaris/
 ADD --chown=nuvolaris:nuvolaris deploy/ferretdb /home/nuvolaris/deploy/ferretdb
 ADD --chown=nuvolaris:nuvolaris deploy/runtimes /home/nuvolaris/deploy/runtimes
 ADD --chown=nuvolaris:nuvolaris deploy/postgres-backup /home/nuvolaris/deploy/postgres-backup
+ADD --chown=nuvolaris:nuvolaris deploy/spark /home/nuvolaris/deploy/spark
 ADD --chown=nuvolaris:nuvolaris run.sh dbinit.sh cron.sh pyproject.toml poetry.lock whisk-system.sh /home/nuvolaris/

-# prepares the required folders to deploy the whisk-system actions
-RUN mkdir /home/nuvolaris/deploy/whisk-system
 ADD --chown=nuvolaris:nuvolaris actions /home/nuvolaris/actions

 # enterprise specific
@@ -89,7 +88,8 @@ ENV POETRY_CACHE_DIR=/opt/.cache
 ENV PATH=${POETRY_HOME}/bin:$PATH

 WORKDIR /home/nuvolaris
-COPY --chown=nuvolaris:nuvolaris pyproject.toml poetry.lock /home/nuvolaris/
+# Use numeric UID:GID to avoid requiring user in this stage
+COPY --chown=1001:1001 pyproject.toml poetry.lock /home/nuvolaris/
 RUN echo "Installing poetry" && \
 # Install minimal dependencies
 echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections && \
@@ -144,6 +144,9 @@ RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone &
 # install taskfile
 curl -sL https://taskfile.dev/install.sh | sh -s -- -d -b /usr/bin

+# ensure whisk-system deploy folder exists and owned by nuvolaris
+RUN install -d -o 1001 -g 1001 /home/nuvolaris/deploy/whisk-system
+
 USER nuvolaris
 WORKDIR /home/nuvolaris
 # Copy virtualenv
@@ -153,8 +156,7 @@ COPY --from=deps --chown=nuvolaris:nuvolaris ${POETRY_HOME} ${POETRY_HOME}
 # Copy the home
 COPY --from=sources --chown=nuvolaris:nuvolaris ${HOME} ${HOME}
 RUN poetry install --only main --no-interaction --no-ansi && rm -rf ${POETRY_CACHE_DIR}
-# prepares the required folders to deploy the whisk-system actions
-RUN mkdir -p /home/nuvolaris/deploy/whisk-system && \
-    ./whisk-system.sh && \
+# initialize whisk-system content
+RUN ./whisk-system.sh && \
     cd deploy && tar cvf ../deploy.tar *
 CMD ["./run.sh"]
61 changes: 61 additions & 0 deletions README.md
@@ -56,6 +56,67 @@ task deploy
Once you have finished with development, you can create a public image with `task publish`, which publishes the tag and triggers the image build.

## Spark Operator Integration

OpenServerless includes a Spark operator that provides automated Spark cluster deployment and SparkJob CRD support for job execution. The Spark operator follows standard OpenServerless patterns for resource management.

### Quick Start

1. **Deploy Spark Operator and Cluster**:
```bash
task spark-standard
```

2. **Test SparkJob CRD**:
```bash
task sparkjob-deploy-crd
task sparkjob-test-examples
```

3. **Access Spark UI** (with port-forwarding):
```bash
kubectl port-forward -n nuvolaris service/spark-master 8080:8080 # Master UI
kubectl port-forward -n nuvolaris service/spark-history 18080:18080 # History Server
```
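With the port-forwards in place, a quick sanity check that the cluster actually came up can save time — a sketch, assuming the `spark-master` and `spark-history` service names from above, and assuming the pods created by `task spark-standard` have `spark` somewhere in their names:

```shell
# List Spark-related pods (the grep pattern is an assumption about naming)
kubectl -n nuvolaris get pods | grep -i spark
# Confirm the two services exposed by the port-forward commands above exist
kubectl -n nuvolaris get svc spark-master spark-history
```

Pods stuck in `Pending` or `ImagePullBackOff` here point at registry configuration, covered below under Monitoring.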

### SparkJob CRD

The SparkJob Custom Resource Definition enables automated execution of Spark applications. See the [SparkJob Documentation](docs/SPARKJOB.md) for a complete usage guide, including:

- PySpark, Scala, and Java application examples
- Inline code execution
- Resource configuration
- Monitoring and logging
- Troubleshooting guide
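The authoritative schema lives in [docs/SPARKJOB.md](docs/SPARKJOB.md); purely as a sketch, an inline PySpark job might be submitted like this — the `apiVersion` and every `spec` field name below are assumptions, not the real schema:

```shell
# Write a hypothetical SparkJob manifest and apply it if kubectl is present.
# All field names are illustrative -- check docs/SPARKJOB.md for the real
# schema before relying on this.
cat > /tmp/sparkjob-example.yaml <<'EOF'
apiVersion: nuvolaris.org/v1
kind: SparkJob
metadata:
  name: pi-inline
  namespace: nuvolaris
spec:
  language: python
  # inline code execution, as mentioned in the feature list above
  code: |
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("pi-inline").getOrCreate()
    print(spark.sparkContext.parallelize(range(1000)).count())
    spark.stop()
EOF
if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f /tmp/sparkjob-example.yaml
fi
```

The quoted `'EOF'` heredoc keeps the PySpark snippet verbatim, with no shell expansion inside the manifest.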

### Standard OpenServerless Integration

The Spark operator follows standard OpenServerless patterns:
- **Templates**: Uses Jinja2 templates in `/nuvolaris/templates/`
- **Build Pipeline**: Standard GHCR workflow with `spark-build-ghcr`, `spark-push-ghcr`, `spark-all-ghcr` tasks
- **Configuration**: Uses `.env` pattern for GHCR credentials
- **Kopf Handlers**: Standard `@kopf.on.create`, `@kopf.on.delete` patterns for CRD lifecycle
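The build-pipeline tasks named above compose in the usual way — a sketch, assuming `spark-all-ghcr` is wired in the Taskfile as the build-then-push shortcut and that `GHCR_USER`/`GHCR_TOKEN` are set in `.env`:

```shell
# Build and push the Spark operator image through the GHCR pipeline, step by step...
task spark-build-ghcr
task spark-push-ghcr
# ...or, assuming spark-all-ghcr chains both, in one shot:
task spark-all-ghcr
```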

### GHCR Integration (Optional)

For production deployments, configure GitHub Container Registry in `.env`:
```bash
GITHUB_USER=<your-github-username>
GHCR_USER=<your-github-username>
GHCR_TOKEN=<personal-access-token-with-packages-read-write>
```

Generate token: GitHub → Settings → Developer settings → Personal access tokens → scopes: `read:packages`, `write:packages`
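Before handing the token to the build, it can be verified locally. `docker login --password-stdin` is standard Docker CLI, and for classic PATs GitHub reports granted scopes in the `X-OAuth-Scopes` response header (this assumes `GHCR_USER` and `GHCR_TOKEN` are exported from your `.env`):

```shell
# Log in to GHCR without putting the token on the command line or in history
echo "$GHCR_TOKEN" | docker login ghcr.io -u "$GHCR_USER" --password-stdin
# Classic PATs: confirm read:packages / write:packages were actually granted
curl -sI -H "Authorization: token $GHCR_TOKEN" https://api.github.com/user \
  | grep -i '^x-oauth-scopes'
```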

### Monitoring

Check operator logs:
```bash
kubectl -n nuvolaris logs pod/nuvolaris-operator-spark | grep -i spark
```

If pods show `ImagePullBackOff`, verify registry configuration or use GHCR workflow.
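When that happens, the usual Kubernetes triage applies — a sketch in which `<spark-pod-name>` and the `ghcr-pull` secret name are placeholders, and the pull secret only helps if the operator's pod templates reference it via `imagePullSecrets`:

```shell
# Find the failing pod and read the pull error from its events
kubectl -n nuvolaris get pods | grep -i spark
kubectl -n nuvolaris describe pod <spark-pod-name> | grep -A 10 'Events:'
# For a private GHCR image, a registry pull secret is one common fix
kubectl -n nuvolaris create secret docker-registry ghcr-pull \
  --docker-server=ghcr.io \
  --docker-username="$GHCR_USER" \
  --docker-password="$GHCR_TOKEN"
```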

## Prerequisites

1. Please set up and use a development VM [as described here](https://github.com/apache/openserverless)