GraphQL Coverage is a powerful tool designed to assess how extensively your GraphQL schema is utilized by your queries. By analyzing the fields defined in your schema and comparing them against the fields used in your queries, this tool provides valuable insights into the coverage and potential gaps within your GraphQL implementation.
While the Jupyter Notebook (graphql_coverage.ipynb) serves as a playground for exploratory analysis, the primary interface for users is the command-line script (graphql_coverage.py). This README will guide you through the installation, usage, and functionalities of the CLI tool.
- Comprehensive Field Extraction: Extracts all fields or only leaf fields from your GraphQL schema.
- Query Analysis: Parses and analyzes GraphQL queries to determine field usage.
- Coverage Calculation: Computes the percentage of schema fields utilized by queries.
- Detailed Reporting: Generates comprehensive reports in CSV format and visualizes coverage with charts.
- Configurable Depth: Allows aggregation of fields at specified depths for streamlined reporting.
- Normalization Option: Supports case-insensitive comparison of field names.
1. Clone the repository:

   ```bash
   git clone https://github.com/pligor/graphql-coverage.git
   cd graphql-coverage
   ```

2. Install dependencies:

   Ensure you have Python 3.7 or higher installed, then install the required packages with pip:

   ```bash
   pip install -r requirements.txt
   ```
The primary tool is the graphql_coverage.py script, which can be executed via the command line. Below is a step-by-step guide to using the script effectively.
| Option | Description | Default |
|---|---|---|
| `--schema_path` | Path to the GraphQL schema file. | `GraphQLClients/spaceXplayground/schema.graphql` |
| `--queries_path` | Path to the directory containing GraphQL queries. | `GraphQLClients/spaceXplayground/Queries` |
| `--only_leafs` | If set, only leaf fields are considered. | `False` |
| `--depth` | Depth for reporting coverage; aggregates fields at this level. | `1` |
| `--normalize_field_names` | If set, field names are normalized (case-insensitive). | `False` |
| `--csv_path` | Path to the CSV file for the coverage report. | `schema_coverage_report.csv` |
| `--plot_path` | Path to the plot file for the coverage chart. | `schema_coverage_chart.png` |
**Basic Usage**

Analyze the default schema and queries directory, extracting all fields:

```bash
python graphql_coverage.py
```

**Only Leaf Fields**

Focus the analysis on leaf fields:

```bash
python graphql_coverage.py --only_leafs
```

**Specify Custom Paths**

Provide custom paths for the schema and queries:

```bash
python graphql_coverage.py --schema_path path/to/schema.graphql --queries_path path/to/queries/
```

**Normalize Field Names and Adjust Depth**

Normalize field names for case-insensitive comparison and aggregate fields at depth 2:

```bash
python graphql_coverage.py --normalize_field_names --depth 2
```

**Custom Report and Plot Paths**

Define custom output paths for the CSV report and coverage chart:

```bash
python graphql_coverage.py --csv_path output/report.csv --plot_path output/chart.png
```
Upon execution, the script performs the following steps:
**1. Schema Loading**

- Loads the entire GraphQL schema from the specified file.
- Extracts all fields, or only leaf fields, depending on the `--only_leafs` flag.
**2. Query Loading**

- Recursively searches the specified directory for all `.graphql` query files.
- Reads each query, storing its file path and content.
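A minimal sketch of this discovery step, assuming nothing beyond the standard library (`load_queries` and the demo directory are invented for illustration):

```python
import tempfile
from pathlib import Path

def load_queries(queries_dir):
    """Map each .graphql file path (found recursively) to its text content."""
    return {str(p): p.read_text() for p in Path(queries_dir).rglob("*.graphql")}

# Demo with a throwaway directory standing in for a real queries folder:
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "launches").mkdir()
    (Path(d) / "launches" / "getLaunches.graphql").write_text("{ launches { id } }")
    queries = load_queries(d)
    print(len(queries))  # 1
```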
**3. Field Usage Extraction**

- Parses each query, handling fragments, and extracts hierarchical field names.
- Counts how many queries each field appears in.
**4. Coverage Calculation**

- Compares the extracted schema fields against the fields used in queries.
- Calculates the coverage percentage.
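Stripped of reporting concerns, the calculation reduces to a set comparison (field names and numbers below are invented):

```python
# Toy coverage computation with invented field names.
schema_fields = {"Query.launches", "Launch.mission_name", "Launch.rocket"}
used_fields = {"Query.launches", "Launch.mission_name"}

covered = schema_fields & used_fields
coverage_pct = 100.0 * len(covered) / len(schema_fields)
print(f"{coverage_pct:.1f}% of schema fields are used")  # 2 of 3 -> 66.7%
```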
**5. Report Generation**

- Generates a CSV report detailing field usage and coverage.
- Creates a visual chart representing the coverage.
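The CSV step can be sketched with the standard `csv` module; apart from `Covered`, which `calculate_csv_coverage_stats.py` relies on, the column names here are assumptions:

```python
import csv

# Sketch of the CSV-report step; "Field" is an assumed column name.
rows = [
    {"Field": "Query.launches", "Covered": True},
    {"Field": "Launch.rocket", "Covered": False},
]
with open("schema_coverage_report.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["Field", "Covered"])
    writer.writeheader()
    writer.writerows(rows)
```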
After successful execution, you will find `schema_coverage_report.csv` and `schema_coverage_chart.png` at your specified output paths.
The calculate_csv_coverage_stats.py script provides aggregate statistics across multiple CSV coverage reports. This post-processing tool is useful when you have generated multiple coverage reports (e.g., for different GraphQL clients) and want to analyze overall coverage metrics.
The script aggregates coverage statistics from multiple CSV files, calculating both per-file metrics and overall aggregated statistics. This helps you understand coverage patterns across different schemas or clients.
For each CSV file, the script:
- Counts rows where the `Covered` column is `True` (numerator)
- Counts the total number of data rows, excluding the header (denominator)
- Calculates the coverage fraction: `numerator / denominator`
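Assuming the boolean `Covered` column described above, the per-file computation could look like this with pandas (`coverage_fraction` is a hypothetical helper, not the script's actual code):

```python
import pandas as pd

def coverage_fraction(csv_path):
    df = pd.read_csv(csv_path)
    numerator = int((df["Covered"] == True).sum())  # rows marked covered
    denominator = len(df)                           # data rows (header excluded)
    return numerator, denominator

# Tiny demo file standing in for a real report:
with open("demo_report.csv", "w") as f:
    f.write("Field,Covered\nQuery.launches,True\nLaunch.rocket,False\n")

num, den = coverage_fraction("demo_report.csv")
print(f"{num}/{den} = {num / den:.4f}")  # 1/2 = 0.5000
```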
The script then computes two aggregate statistics:
- Average of all fractions: Sum of all individual fractions divided by the number of CSV files
- Overall fraction: Sum of all numerators divided by the sum of all denominators
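Using the three sample reports from the example output shown further down, the two aggregates work out as:

```python
# Numerators/denominators taken from the sample output in this README.
numerators = [65, 15, 156]
denominators = [260, 35, 270]

fractions = [n / d for n, d in zip(numerators, denominators)]
average_of_fractions = sum(fractions) / len(fractions)  # each file weighted equally
overall_fraction = sum(numerators) / sum(denominators)  # each field weighted equally

print(f"{average_of_fractions:.4f}")  # 0.4188
print(f"{overall_fraction:.4f}")      # 0.4177
```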
- Python 3.x
- pandas library (install via `pip install pandas`)
By default, the script processes CSV files in the results/csv/ directory:
```bash
python calculate_csv_coverage_stats.py
```

You can specify a custom directory containing CSV files:

```bash
python calculate_csv_coverage_stats.py results/csv
```

Or use an absolute path:

```bash
# Windows
python calculate_csv_coverage_stats.py "C:\path\to\csv\directory"

# Linux/Mac
python calculate_csv_coverage_stats.py /path/to/csv/directory
```

The script prints detailed statistics for each CSV file, followed by aggregate statistics:
```
Processing 3 CSV file(s)...

ExternalAuthClient_schema_coverage_report.csv:
Covered entries: 65
Total entries: 260
Fraction: 0.2500 (65/260)

ExternalPublicClient_schema_coverage_report.csv:
Covered entries: 15
Total entries: 35
Fraction: 0.4286 (15/35)

InternalClient_schema_coverage_report.csv:
Covered entries: 156
Total entries: 270
Fraction: 0.5778 (156/270)

==================================================
STATISTICS:
==================================================
1. Average of all fractions: 0.4188
2. Overall fraction (sum of numerators / sum of denominators):
   0.4177 (236/565)
==================================================
```
- Average of all fractions: This metric treats each CSV file equally, regardless of size. It's useful when you want to see the average coverage across different schemas or clients.
- Overall fraction: This metric weights each file by its size. It represents the true overall coverage when considering all fields across all files together.
- The script automatically excludes lock files (files starting with `.~lock`)
- Files with no data rows (denominator = 0) are skipped with a warning
- Errors in individual files are handled gracefully, allowing the script to continue processing remaining files
- The script validates that the provided directory exists and is a valid directory
The repository includes a Jupyter Notebook (graphql_coverage.ipynb) that serves as an interactive environment for experimenting with the coverage analysis. While the CLI script is intended for regular use, the notebook provides a deeper dive into each step of the process, leveraging comments and outputs to enhance understanding.
Contributions are welcome! If you encounter issues, have questions, or want to suggest improvements, please raise an issue on GitHub.
This project is licensed under the GNU Affero General Public License v3.0.
Thank you for using GraphQL Coverage! Your feedback and contributions help improve the tool for everyone.