# 🌐 NIC ETL - REST API

## 📋 What this notebook does

This notebook **documents the NIC ETL REST API** following well-known REST conventions:

- 🗺️ **Site map** listing every available endpoint
- 📚 Simplified **OpenAPI documentation**
- 🔗 **Links to resources** and related notebooks
- 📊 **Response schemas** for each endpoint

## 🎯 Documentation standard

The documentation follows the **OpenAPI 3.0** (Swagger) standard and includes:
- Basic API information
- Endpoints organized by category
- A clear description of each operation
- Example responses

## 🚀 Available Endpoints

### 🔆 Entry and Navigation
- `GET /` - API entry point with navigation links
- `GET /health` - Basic API status

### 📚 Documentation
- `GET /api/v1` - Full OpenAPI documentation for the API

### ▶️ ETL Pipeline
- `GET /api/v1/pipelines/gitlab-qdrant/run` - Run the full ETL pipeline
- `POST /api/v1/pipelines/gitlab-qdrant/run` - Run the full ETL pipeline (same logic as the `GET` variant)

### 📊 Monitoring
- `GET /api/v1/pipelines/gitlab-qdrant/runs/last` - Report from the last run
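
As a rough illustration, the `GET` endpoints listed above can be turned into absolute URLs for a client. This is only a sketch: `BASE_URL` assumes the development server address that appears in the OpenAPI document of this notebook, and `ENDPOINTS` and `endpoint_urls` are hypothetical names introduced for this example.

```python
from urllib.parse import urljoin

# Assumption: the development server declared in this notebook's OpenAPI document.
BASE_URL = "http://localhost:8000"

# GET endpoints from the list above, grouped by category (hypothetical structure).
ENDPOINTS = {
    "entry": ["/", "/health"],
    "documentation": ["/api/v1"],
    "pipeline": ["/api/v1/pipelines/gitlab-qdrant/run"],
    "monitoring": ["/api/v1/pipelines/gitlab-qdrant/runs/last"],
}


def endpoint_urls(base_url: str = BASE_URL) -> dict[str, list[str]]:
    """Return absolute URLs for every endpoint, keyed by category."""
    return {
        category: [urljoin(base_url, path) for path in paths]
        for category, paths in ENDPOINTS.items()
    }


for category, urls in endpoint_urls().items():
    for url in urls:
        print(f"{category:<14} GET {url}")
```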

---

## 🔆 API: Home Page

`GET /`

In [None]:
# GET /
import json
response = {
    "status": "ok",
    "see": [
        "/health",
        "/api/v1"
    ]
}
print(json.dumps(response, indent=2, ensure_ascii=False))
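
The `see` array makes the API navigable from its entry point. The sketch below walks those links breadth-first. To stay runnable without a server it uses canned responses copied from this notebook instead of real HTTP calls; `crawl`, `fetch` and `CANNED` are hypothetical names, and a real client would replace `fetch` with an HTTP request.

```python
from collections import deque
from typing import Callable


def crawl(start: str, fetch: Callable[[str], dict]) -> list[str]:
    """Visit every path reachable through "see" links, breadth-first."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        path = queue.popleft()
        order.append(path)
        for linked in fetch(path).get("see", []):
            if linked not in seen:  # never visit the same path twice
                seen.add(linked)
                queue.append(linked)
    return order


# Stand-in for HTTP calls: responses as documented in this notebook.
CANNED = {
    "/": {"status": "ok", "see": ["/health", "/api/v1"]},
    "/health": {"status": "ok", "see": ["/api/v1"]},
    "/api/v1": {},  # the OpenAPI document carries no "see" key
}

print(crawl("/", CANNED.__getitem__))
```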

## 🩺 API: Status

`GET /health`

In [None]:
# GET /health
import json
response = {
    "status": "ok",
    "see": [ "/api/v1" ]
}
print(json.dumps(response, indent=2, ensure_ascii=False))
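
A caller could use this endpoint as a readiness probe before triggering the pipeline. The following is a minimal sketch under stated assumptions: `probe` is a hypothetical callable returning an `(http_status, payload)` pair, and the 503 reply in the demo is invented purely to exercise the retry path.

```python
import time
from typing import Callable, Tuple

Probe = Callable[[], Tuple[int, dict]]


def is_healthy(http_status: int, payload: dict) -> bool:
    """Healthy means HTTP 200 and a body whose "status" is "ok"."""
    return http_status == 200 and payload.get("status") == "ok"


def wait_until_healthy(probe: Probe, attempts: int = 5, delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe() up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        try:
            if is_healthy(*probe()):
                return True
        except OSError:
            pass  # connection refused or timed out: treat as "not ready yet"
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Demo with canned replies; the first (503) reply is hypothetical.
replies = iter([(503, {}), (200, {"status": "ok", "see": ["/api/v1"]})])
print(wait_until_healthy(lambda: next(replies), sleep=lambda _: None))
```

Injecting `sleep` keeps the function easy to test, since a test can pass a no-op instead of actually waiting.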

## 🗺️ API: OpenAPI Map

`GET /api/v1`

In [None]:
# GET /api/v1
import json
from datetime import datetime

# NIC ETL API Documentation - OpenAPI 3.0 Style
api_documentation = {
    "openapi": "3.0.0",
    "info": {
        "title": "NIC ETL Pipeline API",
        "description": "API REST para execu√ß√£o e monitoramento do pipeline ETL do NIC (N√∫cleo de Intelig√™ncia e Conhecimento)",
        "version": "1.0.0",
        "contact": {
            "name": "NIC ETL Team",
            "url": "http://nic.processa.info"
        }
    },
    "servers": [
        {
            "url": "http://localhost:8000",
            "description": "Servidor de desenvolvimento"
        }
    ],
    "paths": {
        "/": {
            "get": {
                "summary": "NIC REST API - Ponto de entrada",
                "description": "Status b√°sico e links de navega√ß√£o da API",
                "tags": ["Health"],
                "responses": {
                    "200": {
                        "description": "Status OK com links de navega√ß√£o",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "status": {"type": "string", "example": "ok"},
                                        "see": {
                                            "type": "array",
                                            "items": {"type": "string"},
                                            "example": ["/health", "/api/v1"]
                                        }
                                    }
                                }
                            }
                        }
                    }
                }
            }
        },
        "/health": {
            "get": {
                "summary": "Health check geral",
                "description": "Verifica status b√°sico da API",
                "tags": ["Health"],
                "responses": {
                    "200": {
                        "description": "Status OK",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "status": {"type": "string", "example": "ok"},
                                        "see": {"type": "string", "example": "/api/v1"}
                                    }
                                }
                            }
                        }
                    }
                }
            }
        },
        "/api/v1": {
            "get": {
                "summary": "Documenta√ß√£o da API",
                "description": "Retorna esta documenta√ß√£o da API em formato OpenAPI",
                "tags": ["Documentation"],
                "responses": {
                    "200": {
                        "description": "Documenta√ß√£o da API",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/api/v1/pipelines/gitlab-qdrant/run": {
            "get": {
                "summary": "Executar pipeline ETL",
                "description": "Executa o pipeline completo GitLab para QDrant atrav√©s do notebook etl.ipynb",
                "tags": ["Pipeline"],
                "responses": {
                    "200": {
                        "description": "Pipeline executado com sucesso",
                        "content": {
                            "text/plain": {
                                "schema": {
                                    "type": "string",
                                    "example": "‚úÖ etl.ipynb executado com sucesso"
                                }
                            }
                        }
                    },
                    "500": {
                        "description": "Erro na execu√ß√£o do pipeline",
                        "content": {
                            "text/plain": {
                                "schema": {
                                    "type": "string",
                                    "example": "‚ùå Erro ao executar etl.ipynb"
                                }
                            }
                        }
                    }
                }
            }
        },
        "/api/v1/pipelines/gitlab-qdrant/runs/last": {
            "get": {
                "summary": "√öltimo relat√≥rio de execu√ß√£o",
                "description": "Retorna o relat√≥rio completo da √∫ltima execu√ß√£o do pipeline ETL a partir do arquivo pipeline-data/report.json",
                "tags": ["Pipeline", "Monitoring"],
                "responses": {
                    "200": {
                        "description": "Relat√≥rio de execu√ß√£o",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "pipeline_info": {"type": "object"},
                                        "context": {"type": "object"},
                                        "stages": {"type": "array"},
                                        "summary": {"type": "object"},
                                        "api_metadata": {"type": "object"}
                                    }
                                }
                            }
                        }
                    },
                    "404": {
                        "description": "Pipeline ainda n√£o foi executado",
                        "content": {
                            "application/json": {
                                "schema": {
                                    "type": "object",
                                    "properties": {
                                        "pipeline_status": {"type": "string", "example": "NOT_EXECUTED"},
                                        "message": {"type": "string", "example": "Pipeline has not been executed yet"}
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    },
    "components": {
        "schemas": {
            "PipelineStage": {
                "type": "object",
                "properties": {
                    "stage": {"type": "integer"},
                    "name": {"type": "string"},
                    "status": {"type": "string", "enum": ["SUCCESS", "FAILED", "RUNNING"]},
                    "duration_seconds": {"type": "number"}
                }
            },
            "ApiMetadata": {
                "type": "object",
                "properties": {
                    "endpoint": {"type": "string"},
                    "served_at": {"type": "string", "format": "date-time"},
                    "report_exists": {"type": "boolean"}
                }
            }
        }
    },
    "tags": [
        {
            "name": "Documentation",
            "description": "Documenta√ß√£o e metadados da API"
        },
        {
            "name": "Health",
            "description": "Verifica√ß√µes de sa√∫de do sistema"
        },
        {
            "name": "Pipeline",
            "description": "Opera√ß√µes do pipeline ETL"
        },
        {
            "name": "Monitoring",
            "description": "Monitoramento e relat√≥rios"
        }
    ]
}

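# Add response metadata
api_documentation["_metadata"] = {
    "endpoint": "/api/v1",
    # astimezone() attaches the local UTC offset; a naive now() plus "Z" would mislabel local time as UTC
    "served_at": datetime.now().astimezone().isoformat(),
    "description": "Automatically generated documentation for the NIC ETL API",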
    "available_endpoints": [
        "GET /",
        "GET /health", 
        "GET /api/v1",
        "GET /api/v1/pipelines/gitlab-qdrant/run",
        "GET /api/v1/pipelines/gitlab-qdrant/runs/last"
    ],
    "notebook_cells": {
        "root": "cell-2",
        "health": "cell-4", 
        "documentation": "cell-6",
        "pipeline_run": "cell-8",
        "pipeline_status": "cell-10"
    }
}

print(json.dumps(api_documentation, indent=2, ensure_ascii=False))
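
Because the response is a plain OpenAPI 3.0 document, a client can derive its own endpoint listing from the `paths` object. The sketch below shows one way to do that; `list_operations` is a hypothetical helper, and `sample_spec` is a trimmed-down stand-in for the full document built above.

```python
# Operation keys allowed on an OpenAPI 3.0 Path Item Object.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list[tuple[str, str, str]]:
    """Return (METHOD, path, summary) for every operation under "paths"."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method, operation in item.items():
            if method in HTTP_METHODS:  # skip non-operation keys such as "parameters"
                operations.append((method.upper(), path, operation.get("summary", "")))
    return operations


# Trimmed-down stand-in for the full document (illustrative only).
sample_spec = {
    "openapi": "3.0.0",
    "paths": {
        "/health": {"get": {"summary": "Health check"}},
        "/api/v1": {"get": {"summary": "API documentation"}},
    },
}

for method, path, summary in list_operations(sample_spec):
    print(f"{method:<6} {path:<10} {summary}")
```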

## 🚀 Pipeline GitLab-QDrant: Run

`POST /api/v1/pipelines/gitlab-qdrant/run`

In [None]:
# POST /api/v1/pipelines/gitlab-qdrant/run
import json
import subprocess
import time

NOTEBOOK = "etl.ipynb"

pipeline_result = {
    "pipeline": "gitlab-qdrant",
    "notebook": NOTEBOOK,
    "started_at": time.time(),
    "status": None,
    "stdout": None,
    "stderr": None,
    "error": None,
    "finished_at": None
}

try:
    result = subprocess.run(
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",
            "--inplace",
            NOTEBOOK
        ],
        capture_output=True, text=True, check=True
    )
    pipeline_result["status"] = "succeeded"
    pipeline_result["stdout"] = result.stdout
    pipeline_result["stderr"] = result.stderr

except subprocess.CalledProcessError as e:
    pipeline_result["status"] = "failed"
    pipeline_result["stdout"] = e.stdout
    pipeline_result["stderr"] = e.stderr
    pipeline_result["error"] = f"returncode={e.returncode}"

except Exception as e:
    pipeline_result["status"] = "error"
    pipeline_result["error"] = str(e)

finally:
    pipeline_result["finished_at"] = time.time()
    print(json.dumps(pipeline_result, ensure_ascii=False, indent=2))
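
The cell above only prints a JSON body, so a caller has to interpret the `status` field itself. One possible reading, sketched here, maps it onto the 200 and 500 responses that the OpenAPI document declares for this endpoint. `http_status_for` and `last_log_line` are hypothetical helpers, not part of the notebook, and the `failed` sample is invented for illustration.

```python
from typing import Optional


def http_status_for(pipeline_result: dict) -> int:
    """Map "status" to the response codes documented for the run endpoint."""
    return 200 if pipeline_result.get("status") == "succeeded" else 500


def last_log_line(pipeline_result: dict) -> Optional[str]:
    """Return the final non-empty stderr line, or None when stderr is empty or missing."""
    lines = (pipeline_result.get("stderr") or "").strip().splitlines()
    return lines[-1] if lines else None


# "succeeded" mirrors the run recorded in this notebook; "failed" is hypothetical.
succeeded = {
    "status": "succeeded",
    "stderr": "[NbConvertApp] Converting notebook etl.ipynb to notebook\n"
              "[NbConvertApp] Writing 6404 bytes to etl.ipynb\n",
}
failed = {"status": "failed", "stderr": None, "error": "returncode=1"}

for result in (succeeded, failed):
    print(http_status_for(result), last_log_line(result))
```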


## 🚀 Pipeline GitLab-QDrant: Run (Via GET)

`GET /api/v1/pipelines/gitlab-qdrant/run`

In [18]:
# GET /api/v1/pipelines/gitlab-qdrant/run
import json
import subprocess
import time

NOTEBOOK = "etl.ipynb"

pipeline_result = {
    "pipeline": "gitlab-qdrant",
    "notebook": NOTEBOOK,
    "started_at": time.time(),
    "status": None,
    "stdout": None,
    "stderr": None,
    "error": None,
    "finished_at": None
}

try:
    result = subprocess.run(
        [
            "jupyter", "nbconvert",
            "--to", "notebook",
            "--execute",
            "--inplace",
            NOTEBOOK
        ],
        capture_output=True, text=True, check=True
    )
    pipeline_result["status"] = "succeeded"
    pipeline_result["stdout"] = result.stdout
    pipeline_result["stderr"] = result.stderr

except subprocess.CalledProcessError as e:
    pipeline_result["status"] = "failed"
    pipeline_result["stdout"] = e.stdout
    pipeline_result["stderr"] = e.stderr
    pipeline_result["error"] = f"returncode={e.returncode}"

except Exception as e:
    pipeline_result["status"] = "error"
    pipeline_result["error"] = str(e)

finally:
    pipeline_result["finished_at"] = time.time()
    print(json.dumps(pipeline_result, ensure_ascii=False, indent=2))


{
  "pipeline": "gitlab-qdrant",
  "notebook": "etl.ipynb",
  "started_at": 1755500948.1295977,
  "status": "succeeded",
  "stdout": "",
  "stderr": "[NbConvertApp] Converting notebook etl.ipynb to notebook\n[NbConvertApp] Writing 6404 bytes to etl.ipynb\n",
  "error": null,
  "finished_at": 1755501059.1871016
}
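
The `started_at` and `finished_at` fields are Unix epoch seconds as produced by `time.time()`. The sketch below converts them to ISO 8601 UTC timestamps and derives the run duration, using the values from the output above; `describe_run` is a hypothetical helper introduced for this example.

```python
from datetime import datetime, timezone


def describe_run(run: dict) -> dict:
    """Convert epoch timestamps to ISO 8601 (UTC) and compute the duration."""
    started = datetime.fromtimestamp(run["started_at"], tz=timezone.utc)
    finished = datetime.fromtimestamp(run["finished_at"], tz=timezone.utc)
    return {
        "started_at": started.isoformat(timespec="seconds"),
        "finished_at": finished.isoformat(timespec="seconds"),
        "duration_seconds": round(run["finished_at"] - run["started_at"], 3),
    }


# Values copied from the recorded output above.
run = {"started_at": 1755500948.1295977, "finished_at": 1755501059.1871016}
print(describe_run(run))
```

For the recorded run this gives a duration of roughly 111 seconds.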


## 📊 Pipeline GitLab-QDrant: Status

`GET /api/v1/pipelines/gitlab-qdrant/runs/last`

In [None]:
# GET /api/v1/pipelines/gitlab-qdrant/runs/last
import json
from pathlib import Path
from datetime import datetime

# Path to the report file
report_path = Path("pipeline-data/report.json")

# Check whether the report exists
if not report_path.exists():
    # No report yet: say that the pipeline has not been executed
    response = {
        "pipeline_info": {
            "version": "1.0.0",
            "last_execution": None,
            "environment": "unknown"
        },
        "context": {},
        "stages": [],
        "summary": {
            "pipeline_status": "NOT_EXECUTED",
            "message": "Pipeline has not been executed yet. Please run the ETL pipeline first.",
            "total_duration_seconds": 0,
            "data_flow": {
                "input_files": 0,
                "processed_documents": 0,
                "total_chunks": 0,
                "embeddings_generated": 0,
                "vectors_stored": 0
            },
            "validation": {
                "overall": "NOT_AVAILABLE"
            }
        },
        "api_metadata": {
            "endpoint": "/api/v1/pipelines/gitlab-qdrant/runs/last",
            "served_at": datetime.now().isoformat() + "Z",
            "report_exists": False
        }
    }
    print(json.dumps(response, indent=2, ensure_ascii=False))
else:
    # Read and return the report
    try:
        with open(report_path, "r", encoding="utf-8") as f:
            report = json.load(f)
        
        # Add API metadata
        report["api_metadata"] = {
            "endpoint": "/api/v1/pipelines/gitlab-qdrant/runs/last",
            "served_at": datetime.now().astimezone().isoformat(),  # local time with explicit UTC offset
            "report_file": str(report_path),
            "report_exists": True
        }
        
        # Return the full report
        print(json.dumps(report, indent=2, ensure_ascii=False))
        
    except json.JSONDecodeError as e:
        # The file exists but is not valid JSON
        response = {
            "error": "Invalid report format",
            "message": f"The report file exists but contains invalid JSON: {e}",
            "status_code": 500,
            "timestamp": datetime.now().astimezone().isoformat()  # local time with explicit UTC offset
        }
        print(json.dumps(response, indent=2, ensure_ascii=False))
        
    except Exception as e:
        # Any other failure, for example the file cannot be read
        response = {
            "error": "Internal server error",
            "message": f"An error occurred while reading the report: {e}",
            "status_code": 500,
            "timestamp": datetime.now().astimezone().isoformat()  # local time with explicit UTC offset
        }
        print(json.dumps(response, indent=2, ensure_ascii=False))
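
A consumer of this report will often want a compact view of the `stages` array. The sketch below assumes each entry follows the `PipelineStage` schema from the OpenAPI document (`stage`, `name`, `status`, `duration_seconds`); `summarize_stages` is a hypothetical helper, and the sample stages are invented for illustration rather than taken from a real report.

```python
from collections import Counter


def summarize_stages(stages: list[dict]) -> dict:
    """Count stages per status, total their durations, and name the failed ones."""
    counts = Counter(stage.get("status", "UNKNOWN") for stage in stages)
    total = sum(stage.get("duration_seconds", 0) for stage in stages)
    failed = [stage.get("name") for stage in stages if stage.get("status") == "FAILED"]
    return {
        "by_status": dict(counts),
        "total_duration_seconds": total,
        "failed_stages": failed,
    }


# Hypothetical stages, shaped like the PipelineStage schema.
sample_stages = [
    {"stage": 1, "name": "extract", "status": "SUCCESS", "duration_seconds": 1.5},
    {"stage": 2, "name": "transform", "status": "SUCCESS", "duration_seconds": 2.5},
    {"stage": 3, "name": "load", "status": "FAILED", "duration_seconds": 0.5},
]

print(summarize_stages(sample_stages))
```

An empty `stages` list, as returned when no report exists yet, yields zero counts and an empty `failed_stages` list.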