Merged
5 changes: 5 additions & 0 deletions docs/integration/.pages
Expand Up @@ -5,11 +5,16 @@ nav:
- AWS: aws
- Cloudflare: cloudflare.md
- Database: database
- GCP: gcp
- Servers: servers
- DevOps: devops
- Linux: linux.md
- Windows: windows.md
- Vercel: vercel.md
- Heroku: heroku.md
- Message Queues: message-brokers
- Airflow: airflow.md
- Cribl: cribl.md
- Security: security
- Mulesoft: mulesoft.md

158 changes: 158 additions & 0 deletions docs/integration/airflow.md
@@ -0,0 +1,158 @@
---
title: Apache Airflow Integration Guide
description: Collect and monitor Apache Airflow logs and metrics with OpenTelemetry Collector and visualize them in OpenObserve.
---

# Integration with Apache Airflow

This guide explains how to monitor **Apache Airflow** using the [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) (`otelcol`) and export logs, metrics, and traces to **OpenObserve** for visualization.

## Overview

Apache Airflow is a **workflow automation and orchestration tool** widely used for ETL pipelines, ML workflows, and data engineering tasks. Monitoring Airflow is critical for ensuring workflow reliability, debugging issues, and tracking system performance.

With OpenTelemetry and OpenObserve, you gain **real-time observability** into Airflow DAG runs, task execution, scheduler activity, and worker performance.

![Airflow architecture](images/airflow-arch.png)

## Steps to Integrate

??? "Prerequisites"
- OpenObserve account ([Cloud](https://cloud.openobserve.ai/web/) or [Self-Hosted](../../getting-started.md))
- Apache Airflow installed and running
- Basic understanding of Airflow configs (`airflow.cfg`)
- OpenTelemetry Collector installed

??? "Step 1: Configure Airflow for OpenTelemetry"

Edit `airflow.cfg` to enable OTel metrics:

```ini
[metrics]
otel_on = True
otel_host = localhost
otel_port = 4318
```

Apply any pending database migrations and restart the Airflow services:

```bash
airflow db migrate
airflow scheduler -D
airflow webserver -D
```
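
Airflow pushes metrics to the OTLP/HTTP port configured above, so the Collector from Step 2 must be listening on it. As a quick sanity check (a sketch that assumes bash and a Collector on the same host), you can probe the port before restarting Airflow:

```shell
# Probe the OTLP/HTTP port (4318) referenced in airflow.cfg above.
# "open" means something is listening; "closed" usually means the
# Collector (Step 2) is not running yet.
if (exec 3<>/dev/tcp/localhost/4318) 2>/dev/null; then
    echo "port 4318 open"
else
    echo "port 4318 closed"
fi
```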

??? "Step 2: Install OpenTelemetry Collector"

1. Download and install the OTel Collector:
```bash
wget https://github.com/open-telemetry/opentelemetry-collector-releases/releases/latest/download/otelcol-linux-amd64
chmod +x otelcol-linux-amd64
sudo mv otelcol-linux-amd64 /usr/local/bin/otelcol
```

2. Verify installation:
```bash
otelcol --version
```

??? "Step 3: Get OpenObserve Endpoint and Token"

1. In OpenObserve: go to **Data Sources → Otel Collector**
2. Copy the **Ingestion URL** and **Access Token**
![Get OpenObserve Ingestion URL and Token](../images/messagebroker/otel-metrics-source.png)

??? "Step 4: Configure OpenTelemetry Collector"

1. Create/edit config file:
```bash
sudo vi /etc/otel-config.yaml
```

2. Add Airflow configuration:
```yaml
receivers:
filelog/std:
include:
- /airflow/logs/*/*.log
- /airflow/logs/scheduler/*/*/*/*.log
start_at: beginning
otlp:
protocols:
grpc:
http:

processors:
batch:

exporters:
otlphttp/openobserve:
endpoint: OPENOBSERVE_ENDPOINT
headers:
Authorization: "OPENOBSERVE_TOKEN"
stream-name: airflow

service:
pipelines:
metrics:
receivers: [otlp]
processors: [batch]
exporters: [otlphttp/openobserve]
logs:
receivers: [filelog/std, otlp]
processors: [batch]
exporters: [otlphttp/openobserve]
traces:
receivers: [otlp]
processors: [batch]
exporters: [otlphttp/openobserve]
```

Replace placeholders with your OpenObserve details:

- `OPENOBSERVE_ENDPOINT` → API endpoint (e.g., `https://api.openobserve.ai`)
- `OPENOBSERVE_TOKEN` → Access token
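
The substitution can also be scripted with `sed`. The sketch below demonstrates it on a minimal copy of the exporter section; the endpoint and the base64 token are example values, not real credentials:

```shell
# Write a minimal copy of the exporter section, then fill in the
# placeholders with example values using sed.
cat > /tmp/otel-exporter-snippet.yaml <<'EOF'
exporters:
  otlphttp/openobserve:
    endpoint: OPENOBSERVE_ENDPOINT
    headers:
      Authorization: "OPENOBSERVE_TOKEN"
      stream-name: airflow
EOF

sed -i \
  -e 's|OPENOBSERVE_ENDPOINT|https://api.openobserve.ai|' \
  -e 's|OPENOBSERVE_TOKEN|Basic ZXhhbXBsZTp0b2tlbg==|' \
  /tmp/otel-exporter-snippet.yaml

cat /tmp/otel-exporter-snippet.yaml
```

The same two `sed` expressions can be applied to `/etc/otel-config.yaml` once you have your real endpoint and token.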

??? "Step 5: Start OpenTelemetry Collector"

If the Collector runs as a systemd service, start it and check its logs. (A binary installed manually in Step 2 can instead be run in the foreground with `otelcol --config /etc/otel-config.yaml`.)

```bash
sudo systemctl start otel-collector
sudo systemctl status otel-collector
journalctl -u otel-collector -f
```

> Check logs to confirm data is being sent to OpenObserve.
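
Once the Collector is up, a hand-built OTLP/HTTP log record makes a simple smoke test. This is a sketch that assumes the default OTLP/HTTP port 4318 from the config above; the record should then show up in the `airflow` stream:

```shell
# Minimal OTLP/HTTP JSON payload containing one log record.
PAYLOAD='{"resourceLogs":[{"scopeLogs":[{"logRecords":[{"body":{"stringValue":"otel smoke test"}}]}]}]}'

# POST it to the Collector's logs endpoint; prints a note instead of
# failing if the Collector is not reachable.
curl -s -X POST http://localhost:4318/v1/logs \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || echo "collector not reachable on :4318"
```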

??? "Step 6: Visualize Logs in OpenObserve"

1. Go to **Streams → airflow** in OpenObserve to query logs. Collected Airflow logs include DAG execution logs, scheduler logs, worker logs, and task execution logs.

![Visualize Logs in OpenObserve](images/airflow-logs.png)


!!! tip "Prebuilt Dashboards"

[Prebuilt Airflow dashboards](https://github.com/openobserve/dashboards/tree/main/Airflow) are available. You can download the JSON file and import it.

## Troubleshooting

- **No Logs in OpenObserve**

- Ensure `filelog` receiver paths match your Airflow log directory.
- Verify Collector service is running.

- **Metrics Not Visible**

- Check `otel_on = True` in `airflow.cfg`.
- Confirm Airflow is sending metrics to `localhost:4318`.

- **Collector Fails to Start**

- Validate the config:
```bash
otelcol validate --config /etc/otel-config.yaml
```
- Fix any syntax errors or missing receivers the validation reports.

104 changes: 104 additions & 0 deletions docs/integration/cribl.md
@@ -0,0 +1,104 @@
---
title: Cribl Integration Guide
description: Learn how to route, filter, and analyze logs and traces from Cribl into OpenObserve using a webhook destination.
---

# Integration with Cribl

[Cribl Stream](https://cribl.io/) is a data engine designed to manage the flow of observability and security data. It allows you to route, filter, enrich, and transform data before forwarding it to your observability backend.

## Overview

**OpenObserve** is a high-performance, open-source observability platform built for real-time log and trace analytics. By integrating Cribl with OpenObserve, you can optimize data ingestion while reducing costs and improving monitoring.

## Steps to Integrate

??? "Prerequisites"

- Running **Cribl Stream** instance with access to the UI
- OpenObserve account ([Cloud](https://cloud.openobserve.ai/web/) or [Self-Hosted](../../getting-started.md))

??? "Step 1: Configure an Internal Source in Cribl"

First configure an internal Cribl source to generate sample logs.

1. Open **Cribl Stream UI** → navigate to **Worker Group → Routing → QuickConnect**.
![Cribl UI](images/cribl/quick-connect.png)

2. Select **Sources** from the left menu → **Add Source**.
![Add Source](images/cribl/add-source.png)

3. Choose **System → Internal** as the source type, and select **Cribl Internal**.
![Internal Source](images/cribl/cribl-internal.png)

4. Provide a name (e.g., `cribl`) → Save & Start.
![Configured Source](images/cribl/configured-source.png)

At this point, Cribl will generate test logs.

??? "Step 2: Configure a Webhook Destination in Cribl"

1. In the Cribl UI, go to **Destinations** → **Add Destination**, then select **Webhook**.
![Webhook Destination](images/cribl/select-webhook.png)

2. Configure the webhook:

- **Name:** `OpenObserve_Webhook`
- **Webhook URL:**
```
http://<OPENOBSERVE_HOST>/api/default/cribl/_json
```
- **HTTP Method:** `POST`
![Webhook Config](images/cribl/webhook-configuration.png)
- **Authentication:**
- Type: **Basic**
- Username: `O2_USER`
- Password: `O2_PASSWORD`

![Auth Config](images/cribl/webhook-auth.png)

3. Save and activate the destination.

??? "Step 3: Route Source Data to OpenObserve"

1. In Cribl, create a **Route**.
2. Connect the **Internal Source (cribl)** → **Webhook Destination (OpenObserve_Webhook)**.
3. Use **Passthru** for a simple route, then save.
![Route](images/cribl/connection-configuration.png)

> You can test the setup by sending sample logs. A success message indicates that OpenObserve received the data.
![Route Active](images/cribl/test-data.png)

??? "Step 4: Monitor Data in OpenObserve"

- Query Logs
- Go to **Logs → Streams → cribl**. You should see logs ingested from Cribl.
![Logs](images/cribl/query-logs.png)

- Query Traces
- Go to **Traces → Streams → cribl**. You should see traces ingested from Cribl.
![Traces](images/cribl/query-traces.png)
![Traces View](images/cribl/traces-view.png)

## Troubleshooting

??? "No data appearing in OpenObserve"
- Verify the **Route** in Cribl is active and connected to the Webhook destination.
- Double-check the **Webhook URL** format:
```
http://<OPENOBSERVE_HOST>/api/default/cribl/_json
```
- Ensure the correct **stream name** (`cribl`) is set in the URL.
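
To take Cribl out of the loop entirely, you can POST a sample record straight to the ingestion URL with `curl`. The host below is an assumption (the default self-hosted OpenObserve port, 5080); substitute your own host and credentials:

```shell
# Assumed host for a default self-hosted OpenObserve; adjust as needed.
O2_HOST="localhost:5080"
URL="http://$O2_HOST/api/default/cribl/_json"

# Send one JSON record with basic auth; a successful response with no
# error body means ingestion works and the problem is on the Cribl side.
curl -s -u "O2_USER:O2_PASSWORD" \
  -H "Content-Type: application/json" \
  -d '[{"level":"info","message":"cribl connectivity test"}]' \
  "$URL" || echo "OpenObserve not reachable at $O2_HOST"
```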

??? "Authentication failures (401 Unauthorized)"
- Confirm the `O2_USER` and `O2_PASSWORD` are correct.
- Check if the user has **ingest permissions** in OpenObserve.

??? "Connection errors / timeouts"
- Make sure the OpenObserve host is reachable from Cribl.
- If using OpenObserve Cloud, ensure your firewall/VPC rules allow outbound connections.

??? "Logs are ingested but not parsed correctly"
- Confirm you’re using the `_json` endpoint in the Webhook URL.
- Check if the incoming data structure matches what OpenObserve expects (JSON payload).

7 changes: 7 additions & 0 deletions docs/integration/gcp/.pages
@@ -0,0 +1,7 @@
nav:

- GCP Integrations Overview: index.md
- GCP Logs: gcp-logs.md
- Google Cloud Run: cloud-run.md

