# Airflow DAG & Papermill Troubleshooting Guide

This notebook helps troubleshoot common issues with Airflow DAG visibility and Papermill notebook integration in a Kubernetes deployment.

## Overview
- Check Airflow configuration and pod status
- Verify DAG directory structure and file permissions
- Debug DAG loading and parsing issues
- Validate Papermill integration and notebook execution
- Test Kubernetes pod connectivity and resource allocation

## 1. Check Airflow Configuration and Status

First, let's check the status of our Airflow deployment and pods to identify any potential issues.

In [None]:
import subprocess

def run_kubectl_command(command):
    """Execute a kubectl command and return the output"""
    try:
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout.strip()
        else:
            return f"Error: {result.stderr.strip()}"
    except Exception as e:
        return f"Exception: {str(e)}"

# Check Airflow pod status
print("=== Airflow Pod Status ===")
pod_status = run_kubectl_command("kubectl get pods -n airflow")
print(pod_status)
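
The table above has to be read by eye. As a rough sketch, the same check can be done on `kubectl get pods -n airflow -o json` output, flagging pods that are not running or have containers that are not ready. The helper name `find_unhealthy_pods` and the pod names in the sample are made up for illustration; only the `items`, `metadata.name`, `status.phase`, and `status.containerStatuses[].ready` fields come from the Kubernetes Pod object.

```python
import json

def find_unhealthy_pods(pods_json: str) -> list[tuple[str, str]]:
    """Return (pod name, phase) for pods that are not running with all containers ready."""
    unhealthy = []
    for pod in json.loads(pods_json).get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        if phase == "Succeeded":
            # Completed job pods are expected to be "not ready"; skip them
            continue
        container_statuses = status.get("containerStatuses", [])
        all_ready = bool(container_statuses) and all(
            c.get("ready", False) for c in container_statuses
        )
        if phase != "Running" or not all_ready:
            unhealthy.append((name, phase))
    return unhealthy

# Hand-written sample shaped like `kubectl get pods -n airflow -o json` output
# (hypothetical pod names); in the notebook, pass the real command output instead.
sample = json.dumps({"items": [
    {"metadata": {"name": "airflow-scheduler-0"},
     "status": {"phase": "Running", "containerStatuses": [{"ready": True}]}},
    {"metadata": {"name": "airflow-webserver-0"},
     "status": {"phase": "Pending"}},
]})
print(find_unhealthy_pods(sample))
```

An empty list means every non-completed pod is running and ready.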

## 2. Verify DAG Directory and File Structure

Let's inspect the DAG directory structure and check if DAG files exist in the correct location.

In [None]:
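# Get scheduler pod name
scheduler_pod = run_kubectl_command("kubectl get pods -n airflow -l component=scheduler -o jsonpath='{.items[0].metadata.name}'")
# run_kubectl_command returns an "Error: ..." string on failure, so stop here
# rather than passing that string to later kubectl commands
if not scheduler_pod or scheduler_pod.startswith(("Error:", "Exception:")):
    raise RuntimeError(f"Could not find a scheduler pod: {scheduler_pod}")
print(f"Scheduler pod: {scheduler_pod}")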

# Check DAG directory contents
print("\n=== DAG Directory Contents ===")
dag_contents = run_kubectl_command(f"kubectl exec -n airflow {scheduler_pod} -c scheduler -- ls -la /opt/airflow/dags/")
print(dag_contents)

# Check if specific DAG files exist
print("\n=== Checking for specific DAG files ===")
dag_files = ['scala_spark_pipeline.py', 'papermill_example.py', 'data_processing_pipeline.py']
for dag_file in dag_files:
    check_result = run_kubectl_command(f"kubectl exec -n airflow {scheduler_pod} -c scheduler -- test -f /opt/airflow/dags/{dag_file} && echo 'EXISTS' || echo 'NOT FOUND'")
    print(f"{dag_file}: {check_result}")
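
The per-file loop above makes one `kubectl exec` call per DAG file. A possible alternative, sketched here under the assumption that a single `ls -1 /opt/airflow/dags/` listing is available as text, is to compare that listing against the expected names in Python. `missing_files` is a hypothetical helper and the listing below is a hand-written sample, not real output.

```python
def missing_files(listing: str, expected: list[str]) -> list[str]:
    """Return the expected file names that do not appear in an `ls -1` listing."""
    present = {line.strip() for line in listing.splitlines() if line.strip()}
    return [name for name in expected if name not in present]

# Hand-written sample standing in for the output of
# `kubectl exec -n airflow <scheduler-pod> -c scheduler -- ls -1 /opt/airflow/dags/`
sample_listing = "__pycache__\npapermill_example.py\nscala_spark_pipeline.py\n"
expected = ['scala_spark_pipeline.py', 'papermill_example.py', 'data_processing_pipeline.py']
print(missing_files(sample_listing, expected))
```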

## 3. Debug DAG Loading Issues

Let's check the Airflow scheduler and webserver logs for any DAG parsing errors or import failures.

In [None]:
# Check scheduler logs for DAG parsing errors
print("=== Scheduler Logs (last 50 lines) ===")
scheduler_logs = run_kubectl_command(f"kubectl logs -n airflow {scheduler_pod} -c scheduler --tail=50")
print(scheduler_logs)

# Check for specific error patterns
print("\n=== Checking for Import Errors ===")
import_errors = run_kubectl_command(f"kubectl logs -n airflow {scheduler_pod} -c scheduler | grep -i 'import.*error\\|modulenotfound\\|syntax.*error' | tail -10")
if import_errors:
    print(import_errors)
else:
    print("No import errors found in recent logs")

# Test DAG syntax by trying to parse one
print("\n=== Testing DAG Syntax ===")
dag_test = run_kubectl_command(f"kubectl exec -n airflow {scheduler_pod} -c scheduler -- python -m py_compile /opt/airflow/dags/scala_spark_pipeline.py")
print(f"DAG syntax test result: {dag_test if dag_test else 'No syntax errors'}")
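
One caveat with the `grep | tail` pipeline above: the shell reports the exit status of `tail`, so a failed `kubectl logs` call looks the same as "no errors found". A minimal sketch of doing the filtering in Python instead, so the `kubectl logs` call can be checked on its own, is shown below. `find_error_lines` is a hypothetical helper, and the sample log is hand-written (plain Python exception messages, not real scheduler output).

```python
import re

# Same patterns as the grep expression, matched case-insensitively
ERROR_PATTERN = re.compile(r"import.*error|modulenotfound|syntax.*error", re.IGNORECASE)

def find_error_lines(log_text: str, limit: int = 10) -> list[str]:
    """Return the last `limit` log lines that match ERROR_PATTERN."""
    matches = [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]
    return matches[-limit:]

# Hand-written sample; in the notebook, pass the text returned by `kubectl logs`
sample_log = "\n".join([
    "starting up",
    "ModuleNotFoundError: No module named 'papermill'",
    "all good here",
    "SyntaxError: invalid syntax",
])
for line in find_error_lines(sample_log):
    print(line)
```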

## 4. Inspect Papermill Integration

Let's verify Papermill installation and check if notebook files are accessible within the Airflow environment.

In [None]:
# kubectl exec takes a pod name and has no label-selector flag, so reuse
# scheduler_pod from section 2
scheduler_exec = ['kubectl', 'exec', '-n', 'airflow', scheduler_pod, '-c', 'scheduler', '--']

# Check if Papermill is installed in Airflow scheduler
print("Checking Papermill availability in Airflow scheduler...")
result = subprocess.run(scheduler_exec + ['python', '-c', 'import papermill; print(papermill.__version__)'],
                       capture_output=True, text=True)
print(f"Papermill in scheduler: {result.stdout.strip() if result.returncode == 0 else 'Not found or error'}")
if result.stderr:
    print(f"Error: {result.stderr}")

# Check if notebook files are accessible from Airflow
print("\nChecking notebook accessibility in Airflow scheduler...")
result = subprocess.run(scheduler_exec + ['ls', '-la', '/opt/airflow/notebooks/'],
                       capture_output=True, text=True)
print(f"Notebooks in scheduler: {result.stdout if result.returncode == 0 else 'Directory not found'}")
if result.stderr:
    print(f"Error: {result.stderr}")

# Check if notebooks directory exists and has content
# (no shell is involved here, so a '2>/dev/null' argument would reach find
# literally; capture_output already keeps stderr out of the printed output)
print("\nChecking for notebooks in common locations...")
locations = ['/opt/airflow/notebooks/', '/home/jovyan/work/notebooks/', '/tmp/notebooks/']
for location in locations:
    result = subprocess.run(scheduler_exec + ['find', location, '-name', '*.ipynb'],
                           capture_output=True, text=True)
    if result.returncode == 0 and result.stdout.strip():
        print(f"Found notebooks in {location}:")
        print(result.stdout)
    else:
        print(f"No notebooks found in {location}")
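
Since `kubectl exec` addresses a single pod by name (optionally with `-n` for the namespace and `-c` for the container), the argument list for these checks can be built in one place. The helper below is a small sketch; `build_exec_args` and the pod name in the example are hypothetical, and the function only assembles arguments without running anything.

```python
def build_exec_args(pod, command, namespace="airflow", container=None):
    """Build the argv list for `kubectl exec` against a single named pod."""
    args = ["kubectl", "exec", "-n", namespace, pod]
    if container:
        args += ["-c", container]
    # Everything after "--" is run inside the container
    return args + ["--"] + list(command)

# Hypothetical pod name; in the notebook, use the name looked up with
# `kubectl get pods -l component=scheduler`
args = build_exec_args("airflow-scheduler-0", ["ls", "-la", "/opt/airflow/notebooks/"],
                       container="scheduler")
print(" ".join(args))
```

The resulting list can be passed straight to `subprocess.run(args, capture_output=True, text=True)`.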

## 5. Validate Kubernetes Resources

Let's check the overall health of our Kubernetes deployment and ensure all services are properly configured.

In [None]:
# Check all pods status in airflow namespace
print("Checking all Airflow pods status...")
result = subprocess.run(['kubectl', 'get', 'pods', '-n', 'airflow', '-o', 'wide'], capture_output=True, text=True)
print(result.stdout)

# Check Jupyter Lab pod status
print("\nChecking Jupyter Lab pod status...")
result = subprocess.run(['kubectl', 'get', 'pods', '-n', 'jupyter', '-o', 'wide'], capture_output=True, text=True)
print(result.stdout)

# Check services and ensure they're accessible
print("\nChecking services...")
result = subprocess.run(['kubectl', 'get', 'svc', '-A'], capture_output=True, text=True)
print(result.stdout)

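# Check if there are any recent events that might indicate issues
# (kubectl get has no --tail flag, so keep the header and last 10 rows in Python)
print("\nChecking recent events...")
result = subprocess.run(['kubectl', 'get', 'events', '--sort-by=.metadata.creationTimestamp', '-A'], capture_output=True, text=True)
event_lines = result.stdout.splitlines()
print("\n".join(event_lines[:1] + event_lines[1:][-10:]))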
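
Most events are routine, so it can help to look only at warnings. The sketch below assumes the events are fetched as JSON (`kubectl get events -A -o json`) and keeps entries whose `type` is `Warning`. `warning_events` is a hypothetical helper, and the sample events are hand-written for illustration rather than taken from a real cluster.

```python
import json

def warning_events(events_json: str, limit: int = 10) -> list[str]:
    """Summarize the last `limit` Warning events as one line each."""
    items = json.loads(events_json).get("items", [])
    warnings = [e for e in items if e.get("type") == "Warning"]
    lines = []
    for event in warnings[-limit:]:
        obj = event.get("involvedObject", {})
        target = f"{obj.get('namespace', '?')}/{obj.get('kind', '?')}/{obj.get('name', '?')}"
        lines.append(f"{target}: {event.get('reason', '')} - {event.get('message', '')}")
    return lines

# Hand-written sample shaped like `kubectl get events -A -o json` output
sample_events = json.dumps({"items": [
    {"type": "Normal", "reason": "Scheduled", "message": "Successfully assigned pod",
     "involvedObject": {"namespace": "airflow", "kind": "Pod", "name": "airflow-webserver-0"}},
    {"type": "Warning", "reason": "BackOff", "message": "Back-off restarting failed container",
     "involvedObject": {"namespace": "airflow", "kind": "Pod", "name": "airflow-scheduler-0"}},
]})
for line in warning_events(sample_events):
    print(line)
```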

## 6. Test Notebook Execution

Let's test if we can execute notebooks using Papermill from within the Kubernetes environment.

In [None]:
# Test Papermill execution from Airflow scheduler
print("Testing Papermill execution capability...")

# kubectl exec takes a pod name and has no label-selector flag, so reuse
# scheduler_pod from section 2 and look up the Jupyter pod by its label
scheduler_exec = ['kubectl', 'exec', '-n', 'airflow', scheduler_pod, '-c', 'scheduler', '--']
jupyter_pod = run_kubectl_command("kubectl get pods -n jupyter -l app=jupyter -o jsonpath='{.items[0].metadata.name}'")

# First, let's see if we can list Jupyter notebooks from the Airflow scheduler
print("Checking if notebooks are accessible from Airflow:")
result = subprocess.run(scheduler_exec + ['find', '/opt/airflow/', '-name', '*.ipynb'],
                       capture_output=True, text=True)
if result.stdout:
    print("Found notebooks:")
    print(result.stdout)
else:
    print("No notebooks found in /opt/airflow/")

# Check if notebooks exist in Jupyter pod
print("\nChecking notebooks in Jupyter pod:")
result = subprocess.run(['kubectl', 'exec', '-n', 'jupyter', jupyter_pod, '--', 'find', '/home/jovyan/', '-name', '*.ipynb'],
                       capture_output=True, text=True)
if result.stdout:
    print("Found notebooks in Jupyter:")
    print(result.stdout)
else:
    print("No notebooks found in Jupyter")
    if result.stderr:
        print(f"Error: {result.stderr}")

# Test basic Papermill functionality
print("\nTesting basic Papermill import and version:")
result = subprocess.run(scheduler_exec + ['python', '-c',
                        'import papermill; import sys; print(f"Papermill {papermill.__version__} on Python {sys.version}")'],
                       capture_output=True, text=True)
print(result.stdout if result.returncode == 0 else f"Error: {result.stderr}")
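
The checks above confirm that Papermill imports, but they do not run a notebook. As a minimal sketch of the next step, the helper below builds a Papermill command line (`papermill INPUT OUTPUT -p NAME VALUE`) that could be run inside the scheduler pod by prefixing it with `kubectl exec -n airflow <scheduler-pod> -c scheduler --`. `build_papermill_args`, the notebook paths, and the `run_date` parameter are all hypothetical; substitute a notebook that exists in your deployment.

```python
def build_papermill_args(input_path, output_path, parameters=None):
    """Build the argv list for the papermill CLI with optional -p parameters."""
    args = ["papermill", input_path, output_path]
    for name, value in (parameters or {}).items():
        # -p NAME VALUE injects a parameter into the notebook's parameters cell
        args += ["-p", name, str(value)]
    return args

# Hypothetical paths and parameter, for illustration only
cmd = build_papermill_args(
    "/opt/airflow/notebooks/example.ipynb",
    "/tmp/example_output.ipynb",
    {"run_date": "2024-01-01"},
)
print(" ".join(cmd))
```

Writing the output notebook to a scratch location such as `/tmp` keeps the test run from overwriting the source notebook.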

## 7. File Permissions and Access Troubleshooting

Let's check file permissions and ensure proper access to DAGs and notebooks.

In [None]:
# kubectl exec takes a pod name and has no label-selector flag, so reuse
# scheduler_pod from section 2
scheduler_exec = ['kubectl', 'exec', '-n', 'airflow', scheduler_pod, '-c', 'scheduler', '--']

# Check file permissions for DAG files
print("Checking DAG file permissions...")
result = subprocess.run(scheduler_exec + ['ls', '-la', '/opt/airflow/dags/'],
                       capture_output=True, text=True)
print(result.stdout if result.returncode == 0 else f"Error: {result.stderr}")

# Check who is the current user in the Airflow scheduler
print("\nChecking current user in Airflow scheduler:")
result = subprocess.run(scheduler_exec + ['whoami'],
                       capture_output=True, text=True)
print(f"Current user: {result.stdout.strip()}")

# Check if we can read the DAG files
print("\nTesting DAG file readability:")
result = subprocess.run(scheduler_exec + ['python', '-c',
                        'import os; files = [f for f in os.listdir("/opt/airflow/dags/") if f.endswith(".py")]; print(f"Python files: {files}")'],
                       capture_output=True, text=True)
print(result.stdout if result.returncode == 0 else f"Error: {result.stderr}")

# Check Airflow configuration for DAG folder
# (single quotes inside the remote snippet avoid backslash-escaped quotes,
# which are a syntax error inside an f-string expression)
print("\nChecking Airflow DAG folder configuration:")
result = subprocess.run(scheduler_exec + ['python', '-c',
                        "from airflow.configuration import conf; print('DAGs folder:', conf.get('core', 'dags_folder'))"],
                       capture_output=True, text=True)
print(result.stdout if result.returncode == 0 else f"Error: {result.stderr}")

# List the DAGs the Airflow CLI can see; this reports what is loadable but
# does not make the running scheduler rescan the folder
print("\nListing DAGs via the Airflow CLI...")
result = subprocess.run(scheduler_exec + ['airflow', 'dags', 'list'],
                       capture_output=True, text=True)
print("DAG list result:")
print(result.stdout if result.returncode == 0 else f"Error: {result.stderr}")
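
To turn the DAG listing into a pass/fail check, one option is to request JSON (`airflow dags list -o json`) and compare the reported `dag_id` values against the DAGs you expect. This is a sketch under assumptions: `missing_dag_ids` is a hypothetical helper, and the DAG IDs below are guesses based on the file names (a DAG's ID is set inside the file and may differ).

```python
import json

def missing_dag_ids(dags_json: str, expected_ids: list[str]) -> list[str]:
    """Return expected DAG IDs that are absent from `airflow dags list -o json` output."""
    loaded = {entry["dag_id"] for entry in json.loads(dags_json)}
    return sorted(set(expected_ids) - loaded)

# Hand-written sample; only the dag_id field is used, other fields are ignored
sample_dags = json.dumps([{"dag_id": "papermill_example"}])
# Hypothetical DAG IDs, assumed to match the DAG file names
expected_ids = ["scala_spark_pipeline", "papermill_example", "data_processing_pipeline"]
print(missing_dag_ids(sample_dags, expected_ids))
```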

## 8. Next Steps and Recommendations

Based on the results above, here are the recommended next steps:

### If DAGs are still not visible:
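1. **Restart Airflow scheduler**: A restart makes the scheduler re-parse the DAG folder from scratch, which can surface DAG files it has not picked up yet
2. **Check DAG syntax**: Ensure there are no Python syntax errors in the DAG files (section 3)
3. **Verify imports**: Make sure every package the DAG files import, including Papermill, is installed in the Airflow image
4. **Check Airflow configuration**: Verify that `dags_folder` points at the directory checked in section 7, and review the DAG scan interval (`dag_dir_list_interval` in Airflow 2.x)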

### If Notebooks are not accessible:
1. **Access Jupyter Lab**: Forward the Jupyter service port and access the web interface
2. **Copy notebooks to correct location**: Ensure notebooks are in the expected Jupyter working directory
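3. **Install required packages**: Install Papermill and its dependencies in whichever environment executes the notebooks (the Airflow scheduler and workers if DAG tasks call Papermill, the Jupyter image if notebooks run there)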

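### Verification Commands:
Service names depend on how the charts were installed; confirm them with `kubectl get svc -n airflow` and `kubectl get svc -n jupyter` first.
- Check Airflow UI: `kubectl port-forward -n airflow svc/airflow-webserver 8080:8080`
- Check Jupyter Lab: `kubectl port-forward -n jupyter svc/jupyter 8888:8888`
- Restart Airflow scheduler: `kubectl delete pod -n airflow -l component=scheduler` (the pod's controller recreates it)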

Run the cells above to get specific diagnostic information, then follow the appropriate recommendations based on your results.