Advanced developer docs #2569

Merged
mjwolf merged 9 commits into elastic:main from mjwolf:advanced-developer-docs
Mar 19, 2026

@mjwolf mjwolf commented Nov 20, 2025

1. What does this PR do?

This adds advanced developer documentation for the ECS tooling. It documents all steps of the generation pipeline, adds information on the ECS-OTel mapping process, and adds pydoc to all functions. It also adds documentation on field reuse and on the subset and exclude filters.

2. Which ECS fields are affected/introduced?

N/A

3. Why is this change necessary?

To improve the ECS developer experience

4. Have you added/updated documentation?

YES

5. Have you built ECS and committed any newly generated files?

YES

6. Have you run the ECS validation tests locally?

YES

7. Anything else for the reviewers?


Commit Message

This adds complete developer-focused documentation to the ECS generation
pipeline, making the codebase significantly more accessible to contributors
and users.

This provides a complete reference for developers to understand, modify,
and extend the ECS generation pipeline.

🤖 Generated with Claude Code

Co-Authored-By: Claude <noreply@anthropic.com>

mjwolf and others added 4 commits November 20, 2025 11:26

This commit adds detailed pydoc and markdown documentation for all major
ECS generator modules, following a consistent documentation pattern:

Module Documentation (pydoc):
- otel.py: OpenTelemetry integration and validation
- markdown_fields.py: Markdown documentation generation
- intermediate_files.py: Intermediate format generation
- es_template.py: Elasticsearch template generation
- csv_generator.py: CSV field reference export
- beats.py: Beats field definition generation
- ecs_helpers.py: Shared utility functions

Each module now includes:
- Comprehensive module-level docstrings explaining purpose and usage
- Detailed function/method docstrings with Args, Returns, Examples
- Clear explanations of key concepts and behaviors

Markdown Guides (scripts/docs/):
- README.md: Central documentation index and getting started guide
- otel-integration.md: OTel semantic conventions integration
- markdown-generator.md: Documentation generation with Jinja2
- intermediate-files.md: Flat and nested format representations
- es-template.md: Elasticsearch template formats (composable & legacy)
- csv-generator.md: CSV field reference for spreadsheets
- beats-generator.md: Beats field definitions and default_field system
- ecs-helpers.md: Utility functions reference and patterns

Each guide includes:
- Architecture diagrams and data flow explanations
- Complete usage examples (CLI and programmatic)
- Troubleshooting sections with common issues
- Making changes guides for extensibility
- Performance considerations
- Testing strategies

This documentation provides developers with everything needed to understand,
modify, and extend the ECS build system.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

Significantly improved schema-pipeline.md to make complex concepts more accessible:

Field Reuse Improvements:
- Added visual ASCII diagrams showing before/after states of reuse operations
- Detailed step-by-step examples of transitivity in action
- Clear distinction between foreign reuse (transitive) and self-nesting (non-transitive)
- Complete use case table explaining when to use each type
- Reuse order explanation with concrete examples
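
The transitive versus non-transitive distinction listed above can be illustrated with a toy model. This is a minimal sketch and not the code in scripts/schema/finalizer.py: the `reuse_foreign`, `reuse_self`, and `flatten` helpers, the dictionary layout, and the two-phase ordering are all assumptions made for illustration.

```python
import copy

# Toy fieldsets: each maps a field name to a leaf (None) or a nested dict.
# This layout is hypothetical and far simpler than the real schema structures.
fieldsets = {
    "group": {"id": None, "name": None},
    "user": {"id": None, "name": None},
    "destination": {"ip": None},
}


def reuse_foreign(schemas, source, target):
    """Copy `source`, including anything already nested in it, under `target`."""
    schemas[target][source] = copy.deepcopy(schemas[source])


def reuse_self(schemas, name, alias):
    """Nest a snapshot of `name` inside itself under `alias`."""
    schemas[name][alias] = copy.deepcopy(schemas[name])


def flatten(tree, prefix=""):
    """Return the sorted dotted names of every leaf field in `tree`."""
    names = []
    for key, value in tree.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            names.extend(flatten(value, dotted + "."))
        else:
            names.append(dotted)
    return sorted(names)


# Phase 1: foreign reuse, in dependency order. Because `group` lands in `user`
# before `user` is copied into `destination`, the nesting carries over.
reuse_foreign(fieldsets, "group", "user")
reuse_foreign(fieldsets, "user", "destination")

# Phase 2: self-nesting, after all foreign reuse. `user.target` therefore never
# reaches `destination.user`, which is the non-transitive behaviour.
reuse_self(fieldsets, "user", "target")

destination_fields = flatten({"destination": fieldsets["destination"]})
user_fields = flatten({"user": fieldsets["user"]})
```

In this toy model, swapping the two phase 1 calls would leave `destination.user` without any `group` fields, which is why the order of reuse matters.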

Subset Filtering Improvements:
- Complete rewrite with motivation and benefits section
- Visual representations of inclusion/exclusion
- Comprehensive syntax guide with 4 different approaches
- Three complete real-world subset examples (web, security, infrastructure)
- Field options documentation (docs_only, index flags)
- Multiple subset merging explained

New Sections:
- Quick Reference cheat sheets for rapid lookup
- Common Patterns section with 4 practical examples
- Enhanced troubleshooting with symptom → cause → solution format
- Debugging commands with copy-paste Python snippets

The documentation now uses visual learning, example-driven explanations,
and troubleshooting-first approach to make field reuse and subset
filtering concepts much easier to understand.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

Added pydoc (docstrings) and markdown documentation for all Python scripts:

Generator Scripts (scripts/generators/):
- otel.py: OTel integration and validation
- markdown_fields.py: Markdown documentation generation
- intermediate_files.py: Intermediate format generation
- es_template.py: Elasticsearch template generation
- ecs_helpers.py: Shared utility functions
- csv_generator.py: CSV field reference export
- beats.py: Beats field definition generation

Schema Processing (scripts/schema/):
- loader.py: Schema loading and nesting
- cleaner.py: Validation and normalization
- finalizer.py: Field reuse and name calculation
- visitor.py: Field traversal utilities
- subset_filter.py: Subset filtering
- exclude_filter.py: Exclude filtering

Main Orchestrator:
- generator.py: Complete pipeline documentation

Documentation Files:
- scripts/docs/README.md: Updated with all module links
- scripts/docs/otel-integration.md: OTel integration guide
- scripts/docs/markdown-generator.md: Markdown generation guide
- scripts/docs/intermediate-files.md: Intermediate files guide
- scripts/docs/es-template.md: Elasticsearch template guide
- scripts/docs/ecs-helpers.md: Utility functions guide
- scripts/docs/csv-generator.md: CSV generator guide
- scripts/docs/beats-generator.md: Beats generator guide
- scripts/docs/schema-pipeline.md: Complete pipeline documentation

Documentation Style:
- Google-style docstrings with Args, Returns, Raises, Examples
- Comprehensive markdown guides with architecture, usage, troubleshooting
- Visual examples and flow diagrams
- Practical code examples
- Common patterns and best practices
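
For readers unfamiliar with the convention, a Google-style docstring with the four sections named above looks roughly like the sketch below. The `dotted_name` function is a hypothetical helper written for this illustration and is not taken from the ECS scripts.

```python
def dotted_name(prefix, name):
    """Join a fieldset prefix and a field name into a dotted field path.

    Args:
        prefix: Name of the parent fieldset, or an empty string for
            top-level fields.
        name: Name of the field within the fieldset.

    Returns:
        The dotted path, for example ``"user.name"``.

    Raises:
        ValueError: If ``name`` is empty.

    Examples:
        >>> dotted_name("user", "name")
        'user.name'
        >>> dotted_name("", "message")
        'message'
    """
    if not name:
        raise ValueError("name must not be empty")
    return f"{prefix}.{name}" if prefix else name
```

Because the Examples section is written in doctest form, running `python -m doctest` on the file that contains it would check the examples against the code.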

This provides a complete developer reference for understanding, modifying,
and extending the ECS generation pipeline.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

@mjwolf mjwolf requested a review from a team as a code owner November 20, 2025 23:59
@github-actions

Documentation changes preview: https://docs-v3-preview.elastic.dev/elastic/ecs/pull/2569/reference/

@github-actions

🤖 GitHub comments

Just comment with:

  • run docs-build : Re-trigger the docs validation. (use unformatted text in the comment!)

github-actions bot commented Nov 21, 2025

@trisch-me trisch-me left a comment

It's hard to check all the files, so I have checked only the most important ones.

@mjwolf mjwolf force-pushed the advanced-developer-docs branch from aa52965 to 0e224ff Compare January 9, 2026 04:58
@kgeller kgeller self-requested a review March 10, 2026 18:30

@kgeller kgeller left a comment

Me and Cursor did a review. Overall I think this is great, just a couple of comments to clean it up a little bit.

Suggestions:

  • Structural repetition across the generator markdown guides — The 7 module-specific guides share a significant amount of copy-pasted content. The same pipeline ASCII diagram, "Running the Generator" make block, "Related Files" list, and "Programmatic Usage" boilerplate appear nearly verbatim in all of them. I'd estimate ~800 lines (~16%) of the new markdown is structural duplication — consider extracting the shared content into the README.md and linking to it.
  • ecs-helpers.md is redundant with the new docstrings — This 562-line file documents every function with examples, but the PR also adds detailed Google-style docstrings with the same examples to the Python source. Maintaining both will inevitably lead to drift — I'd suggest deleting the markdown file and letting the docstrings be the single source of truth.
  • Simple utility function docstrings are over-verbose — Complex functions like normalize_reuse_notation, perform_reuse, and field_finalizer genuinely benefit from the detailed docstrings. However, trivial one-liners like is_yaml(), safe_list(), list_subtract(), and make_dirs() each got 15-17 line docstrings with Args/Returns/Examples/Notes — the original 1-line descriptions were sufficient for these.
  • USAGE.md lost some practical detail — The rewrite removed the full JSON examples for --template-settings / --mapping-settings and the --strict mode error walkthrough. These were useful for users who just read USAGE.md without diving into scripts/docs/ — consider restoring them inline or adding explicit "see X for JSON examples" pointers.
  • Generic troubleshooting padding — Some troubleshooting sections contain advice unrelated to ECS (e.g., how to use Excel's "Text to Columns" feature in csv-generator.md, or how to validate JSON with jq). Keeping only module-specific issues would tighten these up.
  • Stale "coming soon" links: intermediate-files.md has "coming soon" links at the bottom for CSV and ES template guides that already exist in this PR.
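
To make the point about trivial helpers concrete, here is a sketch of what one-line docstrings on small utilities might look like. Both functions are hypothetical stand-ins written for this example; their names and bodies are assumptions and may not match list_subtract() or safe_list() in ecs_helpers.py.

```python
def list_difference(original, removed):
    """Return the items of `original` that are not in `removed`, keeping order."""
    return [item for item in original if item not in removed]


def as_list(value):
    """Return `value` unchanged if it is a list, else split it on commas."""
    return value if isinstance(value, list) else value.split(",")
```

For helpers this small, the signature and a single sentence already say everything an Args/Returns/Examples block would repeat.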

mjwolf and others added 2 commits March 19, 2026 09:40

- Delete scripts/docs/ecs-helpers.md (content lives in docstrings)
- Update scripts/docs/README.md to remove ecs-helpers.md references
- Remove boilerplate "Running the Generator" and "Related Files" sections
  from all module guides; replace with pointer to README.md
- Trim trivial docstrings in ecs_helpers.py (is_yaml, safe_list,
  list_subtract, make_dirs) to single-line or minimal form
- Restore --strict and --template-settings/--mapping-settings detail
  to USAGE.md including JSON examples and concrete error output
- Remove generic Excel/Unicode troubleshooting from csv-generator.md
- Remove jq tip from es-template.md debugging section
- Fix stale "coming soon" markers in intermediate-files.md

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

github-actions bot commented Mar 19, 2026

Vale Linting Results

Summary: 3 warnings, 3 suggestions found

⚠️ Warnings (3)

| File | Line | Rule | Message |
|---|---|---|---|
| docs/reference/ecs-converting.md | 25 | Elastic.DontUse | Don't use 'please'. |
| docs/reference/ecs-field-reference.md | 19 | Elastic.DontUse | Don't use 'please'. |
| docs/reference/index.md | 44 | Elastic.DontUse | Don't use 'please'. |

💡 Suggestions (3)

| File | Line | Rule | Message |
|---|---|---|---|
| docs/reference/ecs-artifacts.md | 13 | Elastic.WordChoice | Consider using 'refer to (if it's a document), view (if it's a UI element)' instead of 'See', unless the term is in the UI. |
| docs/reference/ecs-converting.md | 25 | Elastic.WordChoice | Consider using 'refer to (if it's a document), view (if it's a UI element)' instead of 'see', unless the term is in the UI. |
| docs/reference/ecs-field-reference.md | 19 | Elastic.WordChoice | Consider using 'refer to (if it's a document), view (if it's a UI element)' instead of 'see', unless the term is in the UI. |

The Vale linter checks documentation changes against the Elastic Docs style guide.

To use Vale locally or report issues, refer to Elastic style guide for Vale.

Remove some verbose documentation from docstrings and replace with
condensed versions.
@kgeller kgeller self-requested a review March 19, 2026 20:47

@kgeller kgeller left a comment

Updates look great, thanks @mjwolf !

@mjwolf mjwolf merged commit a661e2d into elastic:main Mar 19, 2026
8 checks passed
kgeller added a commit to kgeller/ecs that referenced this pull request Mar 23, 2026
Add _config.yml to exclude scripts/ and other internal directories
from Jekyll processing. The file scripts/docs/markdown-generator.md
introduced in elastic#2569 contains Jinja2 {% macro %} tags which are unknown
to Liquid, crashing the GitHub Pages build.
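
A scan along the following lines could flag this kind of file before a Pages build is attempted. It is only a sketch and is not part of the ECS repository: the `find_unknown_liquid_tags` helper is hypothetical, and the tag list is an assumption that covers just a few Jinja2 block tags that Liquid does not define.

```python
import re

# A few Jinja2 block tags that Liquid does not define. This list is
# illustrative and deliberately incomplete.
JINJA_ONLY_TAGS = {"macro", "endmacro", "call", "endcall", "extends"}

# Captures the tag name that follows `{%` or `{%-`.
TAG_PATTERN = re.compile(r"\{%-?\s*(\w+)")


def find_unknown_liquid_tags(text):
    """Return (line_number, tag) pairs for Jinja2-only tags found in `text`."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for tag in TAG_PATTERN.findall(line):
            if tag in JINJA_ONLY_TAGS:
                hits.append((number, tag))
    return hits


sample = (
    "# Guide\n"
    "{% macro render_field(field) %}\n"
    "{{ field.name }}\n"
    "{% endmacro %}\n"
    "{% if x %}ok{% endif %}\n"
)
hits = find_unknown_liquid_tags(sample)
```

The commit itself takes a different route: it excludes the directories from Jekyll processing, so these files are never parsed as Liquid at all.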
kgeller added a commit that referenced this pull request Mar 24, 2026

Add _config.yml to exclude scripts/ and other internal directories
from Jekyll processing. The file scripts/docs/markdown-generator.md
introduced in #2569 contains Jinja2 {% macro %} tags which are unknown
to Liquid, crashing the GitHub Pages build.
3 participants