# EOEPCA Application Quality Validation and Usage Notebook

## Introduction

The **Application Quality Building Block** supports the transition of scientific algorithms from research prototypes to production-grade workflows. It provides tooling for:

- **Code Quality Analysis**: Static analysis tools like Flake8, Pylint, Ruff, and Bandit
- **Security Scanning**: Vulnerability detection in code and containers (e.g. Trivy)
- **Best Practice Checks**: Adherence to open reproducible science standards
- **Performance Testing**: Tools to test and optimise workflow execution
- **Pipeline Orchestration**: CWL-based pipelines integrating multiple analysis tools

This notebook validates the deployment and demonstrates usage of the Application Quality API.

## Setup

In [1]:
import os
import requests
import json
from pathlib import Path

import sys
sys.path.append('../')
from modules.helpers import get_access_token, load_eoepca_state, test_cell, test_results

Load the `eoepca state` environment variables:

In [2]:
load_eoepca_state()

In [3]:
platform_domain = os.environ.get("INGRESS_HOST")
http_scheme = os.environ.get("HTTP_SCHEME", "https")
application_quality_domain = f'{http_scheme}://application-quality.{platform_domain}'

# If your cluster uses a self-signed CA certificate, trusting it first avoids TLS errors below.
# For example:
#  kubectl get secret -n cert-manager eoepca-ca-secret -o jsonpath='{.data.ca\.crt}' | base64 -d > /usr/local/share/ca-certificates/eoepca-local-ca/eoepca-local-ca.crt
#  ln -s /usr/local/share/ca-certificates/eoepca-local-ca/eoepca-local-ca.pem /etc/ssl/certs/
#  update-ca-certificates
os.environ["REQUESTS_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt"
verify_tls = True

print(f"Application Quality URL: {application_quality_domain}")

Application Quality URL: https://application-quality.test.eoepca.org
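
Note that `os.environ.get("INGRESS_HOST")` returns `None` when the variable is unset, so the f-string would silently produce a URL ending in `.None`. The following is a minimal sketch of a stricter builder. The function name `build_service_url` is hypothetical and is not part of the notebook's `modules.helpers`.

```python
import os


def build_service_url(service: str, env=os.environ) -> str:
    """Return '<scheme>://<service>.<INGRESS_HOST>', raising if the host is unset."""
    host = env.get("INGRESS_HOST")
    if not host:
        # Fail early instead of building 'https://<service>.None'
        raise RuntimeError("INGRESS_HOST is not set; check that the EOEPCA state was loaded")
    scheme = env.get("HTTP_SCHEME", "https")
    return f"{scheme}://{service}.{host}"


# Example with an explicit mapping instead of the real environment
print(build_service_url("application-quality", {"INGRESS_HOST": "test.eoepca.org"}))
```

Passing the environment as a parameter keeps the function easy to exercise without touching `os.environ`.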


## Validate Application Quality Endpoints

In [4]:
endpoints = [
    ("Landing Page", f"{application_quality_domain}/"),
    ("API Root", f"{application_quality_domain}/api/"),
    ("Tools Endpoint", f"{application_quality_domain}/api/tools/"),
    ("Pipelines Endpoint", f"{application_quality_domain}/api/pipelines/"),
    ("Tags Endpoint", f"{application_quality_domain}/api/tags/"),
]

# Public endpoints should return 200; the pipelines endpoint returns 403 without authentication
print("Validating Application Quality endpoints...\n")
for name, url in endpoints:
    try:
        response = requests.get(url, verify=verify_tls, timeout=10)
        status = "✅" if response.status_code == 200 else "❌"
        print(f"{status} {name} ({url}): {response.status_code}")
    except requests.exceptions.RequestException as e:
        print(f"❌ {name} ({url}): Connection error - {e}")

Validating Application Quality endpoints...

✅ Landing Page (https://application-quality.test.eoepca.org/): 200
✅ API Root (https://application-quality.test.eoepca.org/api/): 200
✅ Tools Endpoint (https://application-quality.test.eoepca.org/api/tools/): 200
❌ Pipelines Endpoint (https://application-quality.test.eoepca.org/api/pipelines/): 403
✅ Tags Endpoint (https://application-quality.test.eoepca.org/api/tags/): 200
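
The 403 on the pipelines endpoint is flagged as a failure above, although that endpoint requires authentication (see the Pipelines section). One way to separate "service is up but wants credentials" from a real failure is sketched below. The function `classify_status` and its `auth_required` flag are hypothetical names introduced for illustration.

```python
def classify_status(status_code: int, auth_required: bool = False) -> str:
    """Map an HTTP status code to a short validation verdict."""
    if status_code == 200:
        return "ok"
    # 401/403 on a protected endpoint means the service answered but wants credentials
    if auth_required and status_code in (401, 403):
        return "auth required"
    return "failed"


# Status codes taken from the validation output above
checks = [
    ("Tools Endpoint", 200, False),
    ("Pipelines Endpoint", 403, True),
]
for name, code, protected in checks:
    print(f"{name}: {code} -> {classify_status(code, protected)}")
```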


## Exploring the API

The Application Quality API provides endpoints for managing analysis tools and pipelines. The main resources are:

- **Tools**: Individual analysis tools (Flake8, Bandit, Trivy, etc.)
- **Pipelines**: Sequences of tools that run against repositories
- **Tags**: Categories for organising tools by asset type or check type

In [5]:
# Check the API root to see available endpoints
api_url = f"{application_quality_domain}/api/"
response = requests.get(api_url, verify=verify_tls)

if response.status_code == 200:
    print("API Root Response:")
    print(json.dumps(response.json(), indent=2))
else:
    print(f"Error: {response.status_code}")

API Root Response:
{
  "pipelines": "https://application-quality.test.eoepca.org/api/pipelines/",
  "tools": "https://application-quality.test.eoepca.org/api/tools/",
  "tags": "https://application-quality.test.eoepca.org/api/tags/"
}
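
Because the root response maps each resource name to its URL, a client could look endpoints up rather than hard-code them. The sketch below works on a literal copy of the response above. `resource_url` is a hypothetical helper, and it assumes every advertised URL ends with a slash, as those shown here do.

```python
import json

# Literal copy of the API root response shown above
api_root = json.loads("""
{
  "pipelines": "https://application-quality.test.eoepca.org/api/pipelines/",
  "tools": "https://application-quality.test.eoepca.org/api/tools/",
  "tags": "https://application-quality.test.eoepca.org/api/tags/"
}
""")


def resource_url(root: dict, resource: str, slug: str = "") -> str:
    """Look up a resource URL in the API root and optionally append a slug."""
    base = root[resource]  # KeyError if the API does not advertise the resource
    return f"{base}{slug}/" if slug else base


print(resource_url(api_root, "tools", "flake8_subworkflow"))
```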


## List Available Analysis Tools

The tools endpoint returns all available analysis tools that can be integrated into quality pipelines. Each tool is a containerised analysis component implemented as a CWL subworkflow.

In [6]:
tools_url = f"{application_quality_domain}/api/tools/"
response = requests.get(tools_url, verify=verify_tls)

if response.status_code == 200:
    tools = response.json()
    print(f"Found {len(tools)} available tools:\n")
    for tool in tools:
        print(f"  • {tool.get('name', 'Unknown')} ({tool.get('slug', '')})")
else:
    print(f"Error fetching tools: {response.status_code}")

Found 11 available tools:

  • Application Package Validator (ap_validator_subworkflow)
  • Bandit (bandit_subworkflow)
  • Clone repo (clone_subworkflow)
  • Flake8 (flake8_subworkflow)
  • Jupyter Notebook Best Practices Checker (ipynb_specs_checker_subworkflow)
  • Papermill (papermill_subworkflow)
  • Pylint (pylint_subworkflow)
  • Ruff - Notebook (ruff_ipynb_subworkflow)
  • Ruff (ruff_subworkflow)
  • SonarQube (sonarqube)
  • Trivy (trivy_subworkflow)


## Explore Tool Details

Let's examine specific tools in detail to understand their capabilities and configuration options.

### Python Code Style: Flake8

In [7]:
flake8_url = f"{application_quality_domain}/api/tools/flake8_subworkflow/"
response = requests.get(flake8_url, verify=verify_tls)

if response.status_code == 200:
    tool = response.json()
    print(f"Tool: {tool.get('name')}")
    print(f"Slug: {tool.get('slug')}")
    print(f"Description: {tool.get('description', 'N/A')}")
    print(f"\nConfigurable Parameters:")
    
    user_params = tool.get('user_params', {})
    if user_params:
        print(json.dumps(user_params, indent=2))
    
    print(f"\nTags: {tool.get('tags', [])}")
else:
    print(f"Error: {response.status_code}")

Tool: Flake8
Slug: flake8_subworkflow
Description: flake8 - Style guide enforcement tool for Python

Configurable Parameters:
{
  "filter": {
    "regex": {
      "type": "string",
      "label": "regex",
      "default": ".*\\.py"
    }
  },
  "flake8": {
    "verbose": {
      "doc": "Increase the verbosity of Flake8\u2019s output.",
      "type": "boolean",
      "label": "Verbose",
      "default": false
    }
  }
}

Tags: [1, 5]
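
The `user_params` document is nested: a first-level group (`filter`, `flake8`) contains parameter names, and each parameter carries a small schema with `type`, `label` and `default`. As a rough sketch, a candidate value could be checked locally against the declared type. How overrides are submitted to the API is not covered in this notebook, so only the local check is shown. `check_override` and `PYTHON_TYPES` are hypothetical names, and only the two types visible above are handled.

```python
# Parameter schema abridged from the Flake8 output above
user_params = {
    "filter": {"regex": {"type": "string", "label": "regex", "default": ".*\\.py"}},
    "flake8": {"verbose": {"type": "boolean", "label": "Verbose", "default": False}},
}

# Assumption: only the two types seen in the output are mapped here
PYTHON_TYPES = {"string": str, "boolean": bool}


def check_override(params: dict, group: str, name: str, value) -> bool:
    """Return True if value matches the declared type of params[group][name]."""
    declared = params[group][name]["type"]
    expected = PYTHON_TYPES.get(declared)
    if expected is None:
        raise ValueError(f"unhandled parameter type: {declared}")
    return isinstance(value, expected)


print(check_override(user_params, "flake8", "verbose", True))
print(check_override(user_params, "filter", "regex", 42))
```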


### Python Linting: Pylint

In [8]:
pylint_url = f"{application_quality_domain}/api/tools/pylint_subworkflow/"
response = requests.get(pylint_url, verify=verify_tls)

if response.status_code == 200:
    tool = response.json()
    print(f"Tool: {tool.get('name')}")
    print(f"Description: {tool.get('description', 'N/A')}")
    print(f"\nConfigurable Parameters:")
    
    user_params = tool.get('user_params', {})
    if user_params:
        for param, details in user_params.items():
            default = details.get('default', 'N/A')
            print(f"  • {param}: {default}")
else:
    print(f"Error: {response.status_code}")

Tool: Pylint
Description: pylint - Static code analyser tool for Python

Configurable Parameters:
  • filter: N/A
  • pylint: N/A
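
The `N/A` values above arise because the loop reads `default` from the first-level entries (`filter`, `pylint`), which are groups of parameters rather than parameters themselves. A minimal sketch that walks both levels follows. It is demonstrated on the Flake8 schema shown earlier, since Pylint's individual parameters are not listed in this notebook. `iter_params` is a hypothetical helper.

```python
def iter_params(user_params: dict):
    """Yield (group, name, default) for every parameter in a nested schema."""
    for group, params in user_params.items():
        for name, details in params.items():
            yield group, name, details.get("default", "N/A")


# Schema abridged from the Flake8 output earlier in this notebook
flake8_params = {
    "filter": {"regex": {"type": "string", "default": ".*\\.py"}},
    "flake8": {"verbose": {"type": "boolean", "default": False}},
}

for group, name, default in iter_params(flake8_params):
    print(f"  • {group}.{name}: {default}")
```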


### Security Analysis: Bandit

Bandit finds common security vulnerabilities in Python code.

In [9]:
bandit_url = f"{application_quality_domain}/api/tools/bandit_subworkflow/"
response = requests.get(bandit_url, verify=verify_tls)

if response.status_code == 200:
    tool = response.json()
    print(f"Tool: {tool.get('name')}")
    print(f"Description: {tool.get('description', 'N/A')}")
    print(f"\nThis tool scans Python code for common security issues such as:")
    print("  • Hardcoded passwords")
    print("  • SQL injection vulnerabilities")
    print("  • Use of insecure functions")
    print("  • Weak cryptographic practices")
else:
    print(f"Error: {response.status_code}")

Tool: Bandit
Description: Bandit - Bandit is a tool designed to find common security issues in Python code

This tool scans Python code for common security issues such as:
  • Hardcoded passwords
  • SQL injection vulnerabilities
  • Use of insecure functions
  • Weak cryptographic practices


### Container Vulnerability Scanning: Trivy

Trivy scans container images for known vulnerabilities.

In [10]:
trivy_url = f"{application_quality_domain}/api/tools/trivy_subworkflow/"
response = requests.get(trivy_url, verify=verify_tls)

if response.status_code == 200:
    tool = response.json()
    print(f"Tool: {tool.get('name')}")
    print(f"Description: {tool.get('description', 'N/A')}")
    print(f"\nConfigurable Parameters:")
    
    user_params = tool.get('user_params', {})
    if user_params:
        print(json.dumps(user_params, indent=2))
else:
    print(f"Error: {response.status_code}")

Tool: Trivy
Description: The all-in-one open source security scanner
Use Trivy to find vulnerabilities (CVE) & misconfigurations (IaC) across code repositories, binary artifacts, container images, Kubernetes clusters, and more.

Configurable Parameters:
{
  "trivy": {
    "image": {
      "type": "string",
      "label": "Docker image",
      "default": "alpine/git"
    }
  }
}


### Application Package Validator

The AP Validator checks CWL files for OGC Best Practice compliance, which is particularly useful for EOEPCA application packages.

In [11]:
ap_validator_url = f"{application_quality_domain}/api/tools/ap_validator_subworkflow/"
response = requests.get(ap_validator_url, verify=verify_tls)

if response.status_code == 200:
    tool = response.json()
    print(f"Tool: {tool.get('name')}")
    print(f"Description: {tool.get('description', 'N/A')}")
    print(f"\nThis tool validates CWL workflows against OGC Application Package standards.")
else:
    print(f"Error: {response.status_code}")

Tool: Application Package Validator
Description: Validation tool for checking OGC compliance of CWL files for application packages.

This tool validates CWL workflows against OGC Application Package standards.


## Browse Tool Categories (Tags)

Tags categorise tools by the type of asset they analyse or the type of check they perform.

In [12]:
tags_url = f"{application_quality_domain}/api/tags/"
response = requests.get(tags_url, verify=verify_tls)

if response.status_code == 200:
    tags = response.json()
    print("Available Tags:\n")
    
    # Group tags by type
    asset_tags = [t for t in tags if t.get('name', '').startswith('asset:')]
    type_tags = [t for t in tags if t.get('name', '').startswith('type:')]
    other_tags = [t for t in tags if not t.get('name', '').startswith(('asset:', 'type:'))]
    
    if asset_tags:
        print("Asset Types:")
        for tag in asset_tags:
            print(f"  [{tag.get('id')}] {tag.get('name')}")
    
    if type_tags:
        print("\nCheck Types:")
        for tag in type_tags:
            print(f"  [{tag.get('id')}] {tag.get('name')}")
    
    if other_tags:
        print("\nOther:")
        for tag in other_tags:
            print(f"  [{tag.get('id')}] {tag.get('name')}")
else:
    print(f"Error: {response.status_code}")

Available Tags:

Asset Types:
  [1] asset: python
  [2] asset: other
  [3] asset: cwl
  [4] asset: notebook
  [9] asset: docker

Check Types:
  [5] type: best practice
  [6] type: app quality
  [7] type: app performance
  [8] type: init


## Filter Tools by Category

You can filter tools based on their tags. Let's find all Python-related tools.

In [13]:
tools_url = f"{application_quality_domain}/api/tools/"
response = requests.get(tools_url, verify=verify_tls)

if response.status_code == 200:
    tools = response.json()
    
    # Find the Python asset tag ID first
    tags_response = requests.get(f"{application_quality_domain}/api/tags/", verify=verify_tls)
    python_tag_id = None
    
    if tags_response.status_code == 200:
        tags = tags_response.json()
        for tag in tags:
            if 'python' in tag.get('name', '').lower():
                python_tag_id = tag.get('id')
                break
    
    if python_tag_id is not None:
        python_tools = [t for t in tools if python_tag_id in t.get('tags', [])]
        print(f"Tools for Python analysis (tag ID {python_tag_id}):\n")
        for tool in python_tools:
            print(f"  • {tool.get('name')} - {tool.get('description', '')[:60]}...")
    else:
        print("Could not find Python tag. Listing all tools with 'python' in description:")
        for tool in tools:
            if 'python' in tool.get('description', '').lower() or 'python' in tool.get('name', '').lower():
                print(f"  • {tool.get('name')}")
else:
    print(f"Error: {response.status_code}")

Tools for Python analysis (tag ID 1):

  • Bandit - Bandit - Bandit is a tool designed to find common security i...
  • Flake8 - flake8 - Style guide enforcement tool for Python...
  • Pylint - pylint - Static code analyser tool for Python...
  • Ruff - Ruff - An extremely fast Python linter and code formatter, w...


## Pipelines

Pipelines combine multiple analysis tools into automated quality workflows. They can be triggered manually or automatically via the Notification & Automation service.

**Note**: Creating and executing pipelines requires authentication via OIDC.

In [14]:
pipelines_url = f"{application_quality_domain}/api/pipelines/"
response = requests.get(pipelines_url, verify=verify_tls)

print(f"Pipelines endpoint status: {response.status_code}")

if response.status_code == 200:
    pipelines = response.json()
    if pipelines:
        print(f"\nFound {len(pipelines)} pipeline(s):")
        for pipeline in pipelines:
            print(f"  • {pipeline.get('name', 'Unknown')} (ID: {pipeline.get('id')})")
    else:
        print("\nNo pipelines found. Pipelines are created via the web portal with authentication.")
else:
    print("\nNote: Pipeline management typically requires authentication.")
    print("Use the web portal to create and manage pipelines.")

Pipelines endpoint status: 403

Note: Pipeline management typically requires authentication.
Use the web portal to create and manage pipelines.


## Working with Authenticated Requests

For full functionality (creating pipelines, executing analysis, viewing results), you need to authenticate with the OIDC provider. Here's how to make authenticated requests:

In [15]:
try:
    username = os.environ.get("KEYCLOAK_TEST_USER", "")
    password = os.environ.get("KEYCLOAK_TEST_PASSWORD", "")
    client_id = os.environ.get("APP_QUALITY_CLIENT_ID", "application-quality")
    client_secret = os.environ.get("APP_QUALITY_CLIENT_SECRET", "")
    
    if username and password:
        # Attempt to get an access token
        token = get_access_token(
            username=username,
            password=password,
            client_id=client_id,
            client_secret=client_secret if client_secret else None
        )
        
        if token:
            print("✅ Successfully obtained access token", token[:10] + "...")
            
            # Make authenticated request
            headers = {"Authorization": f"Bearer {token}"}
            response = requests.get(pipelines_url, headers=headers, verify=verify_tls)
            
            if response.status_code == 200:
                pipelines = response.json()
                if pipelines:
                    print(f"\nFound {len(pipelines)} pipeline(s):")
                    for pipeline in pipelines:
                        print(f"  • {pipeline.get('name', 'Unknown')} (ID: {pipeline.get('id')})")
                else:
                    print("\nNo pipelines configured yet.")
            else:
                print(f"\nPipelines request returned: {response.status_code}")
    else:
        print("ℹ️  No credentials configured. Skipping authenticated requests.")
        print("   Set KEYCLOAK_TEST_USER and KEYCLOAK_TEST_PASSWORD environment variables for authenticated access.")
        
except Exception as e:
    print(f"Authentication not configured or failed: {e}")
    print("This is expected if you haven't set up OIDC credentials.")

✅ Successfully obtained access token eyJhbGciOi...
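
The access token is a JWT, so its payload can be inspected locally, for example to see how long it remains valid before requesting a new one. The sketch below decodes the payload without verifying the signature, which is acceptable for diagnostics only and must not be used to decide whether a token is trustworthy. `jwt_claims`, `seconds_until_expiry` and `_segment` are hypothetical helpers, and the demo token is synthetic.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    # base64url segments omit padding; restore it before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Return the time left before the 'exp' claim (negative if expired)."""
    now = time.time() if now is None else now
    return jwt_claims(token)["exp"] - now


def _segment(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Synthetic, unsigned token for demonstration only
demo_token = ".".join([_segment({"alg": "none"}), _segment({"exp": 1300}), "sig"])
print(seconds_until_expiry(demo_token, now=1000))
```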

## Summarise Available Tools

Let's create a summary of all available tools and their purposes.

In [16]:
tools_url = f"{application_quality_domain}/api/tools/"
tags_url = f"{application_quality_domain}/api/tags/"

tools_response = requests.get(tools_url, verify=verify_tls)
tags_response = requests.get(tags_url, verify=verify_tls)

if tools_response.status_code == 200 and tags_response.status_code == 200:
    tools = tools_response.json()
    tags = {t['id']: t['name'] for t in tags_response.json()}
    
    print("=" * 70)
    print("APPLICATION QUALITY TOOLS SUMMARY")
    print("=" * 70)
    
    for tool in tools:
        print(f"\n{tool.get('name', 'Unknown')}")
        print("-" * len(tool.get('name', 'Unknown')))
        print(f"Slug: {tool.get('slug')}")
        
        desc = tool.get('description', 'No description')
        if len(desc) > 100:
            desc = desc[:100] + "..."
        print(f"Description: {desc}")
        
        tool_tags = tool.get('tags', [])
        if tool_tags:
            tag_names = [tags.get(t, f'Unknown({t})') for t in tool_tags]
            print(f"Categories: {', '.join(tag_names)}")
else:
    print("Error fetching tools or tags")

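======================================================================
APPLICATION QUALITY TOOLS SUMMARY
======================================================================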

Application Package Validator
-----------------------------
Slug: ap_validator_subworkflow
Description: Validation tool for checking OGC compliance of CWL files for application packages.
Categories: asset: cwl, type: best practice

Bandit
------
Slug: bandit_subworkflow
Description: Bandit - Bandit is a tool designed to find common security issues in Python code
Categories: asset: python, type: app quality

Clone repo
----------
Slug: clone_subworkflow
Description: git-clone - Clone a repository into a new directory
Categories: type: init, asset: other

Flake8
------
Slug: flake8_subworkflow
Description: flake8 - Style guide enforcement tool for Python
Categories: asset: python, type: best practice

Jupyter Notebook Best Practices Checker
---------------------------------------
Slug: ipynb_specs_checker_subworkflow
Description: Best practices checker for Jupyter notebook specs
Categories: asset: notebook, type: best practice

Papermill
---------
Slug:

## Web Portal Access

The Application Quality web portal provides a graphical interface for:

- Browsing available tools and their documentation
- Creating and managing pipelines
- Executing pipelines against Git repositories
- Viewing execution results and reports

Access the portal at the URL below. Note that full functionality requires OIDC authentication.

In [17]:
print(f"Web Portal URL: {application_quality_domain}")
print(f"\nWithout authentication, you can:")
print("  • Browse available analysis tools")
print("  • View tool documentation and parameters")
print(f"\nWith OIDC authentication, you can also:")
print("  • View, create and configure pipelines")
print("  • Execute pipelines against your repositories")
print("  • View detailed execution results and reports")
print("  • Access performance metrics and logs")

Web Portal URL: https://application-quality.test.eoepca.org

Without authentication, you can:
  • Browse available analysis tools
  • View tool documentation and parameters

With OIDC authentication, you can also:
  • View, create and configure pipelines
  • Execute pipelines against your repositories
  • View detailed execution results and reports
  • Access performance metrics and logs


## Summary

This notebook demonstrated:

1. **Endpoint Validation**: Confirming the Application Quality BB is deployed and accessible
2. **API Exploration**: Understanding the structure of the REST API
3. **Tool Discovery**: Listing and examining available analysis tools
4. **Tag Categories**: Browsing tool categories for filtering
5. **Authentication**: How to make authenticated requests for full functionality

### Next Steps

- Access the web portal to create pipelines interactively
- Configure OIDC authentication for full API access
- Integrate quality pipelines with your CI/CD workflows via the Notification & Automation BB

### Further Resources

- [Application Quality Documentation](https://eoepca.readthedocs.io/projects/application-quality/en/latest/)
- [Application Quality GitHub Repository](https://github.com/EOEPCA/application-quality)
- [EOEPCA Deployment Guide](https://deployment-guide.docs.eoepca.org/current/building-blocks/application-quality/)
- [Common Workflow Language](https://www.commonwl.org/)