ruby_base

ruby_base is the open-source Ruby analysis package that powers Graphops Ruby scanning. It parses Ruby files with tree-sitter, builds a project dictionary in Phase 1, loads and applies rules in Phase 2, and writes protobuf batches used by the Graphops backend and UI.

This package is open source and released under the MIT License.

What This Package Does

ruby_base is responsible for:

scanning Ruby projects and finding .rb files
parsing Ruby ASTs with tree-sitter
building a Phase 1 dictionary of discovered classes, modules, and namespaces
resolving semantic rules from inheritance chains
loading rule JSON from bundled data, local rules, cache, or backend
extracting metadata such as methods, calls, includes, callbacks, associations, validations, and namespace structure
writing protobuf+gzip batches for upload
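
The file-discovery step can be pictured with a minimal sketch. This is not the code in ruby_base.scanner: the function name find_ruby_files is hypothetical, and it assumes an excluded entry is a directory name that may appear anywhere in the relative path.

```python
from pathlib import Path


def find_ruby_files(root: Path, excluded: list[str]) -> list[Path]:
    """Return .rb files under root, skipping excluded directory names.

    Assumption: an excluded entry matches a directory name at any depth.
    """
    excluded_names = set(excluded)
    found = []
    for path in sorted(root.rglob("*.rb")):
        # Only the directory parts of the relative path are checked.
        directory_parts = path.relative_to(root).parts[:-1]
        if path.is_file() and excluded_names.isdisjoint(directory_parts):
            found.append(path)
    return found
```

For example, find_ruby_files(Path("backend"), ["tmp", "vendor"]) would return every .rb file under backend except those inside a tmp or vendor directory.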

Open Source Packages

Graphops currently publishes two open-source packages:

ruby_base - performs the Ruby analysis
graphops_interface - orchestrates scan execution, project init, backend configuration, and upload

Package Structure

ruby_base.scanner - file discovery
ruby_base.parser - tree-sitter parsing and AST extraction
ruby_base.rules - rule loading and caching
ruby_base.builders - Phase 1 and Phase 2 builders
ruby_base.proto - protobuf schema, writer, and reader utilities
ruby_base.type_resolver - inheritance-based semantic rule resolution

Installation

Editable install for local development

From the monorepo root:

pip install -e ./graphops_interface
pip install -e ./ruby_base

From the package directory only:

pip install -e .

Install from your private Graphops package index

ruby_base depends on graphops_interface, which is not published on the public PyPI index. Install ruby_base from the private package index that hosts both packages:

pip install ruby_base \
  --extra-index-url "https://YOUR_TOKEN:@api.graphops.tech/pypi/simple/" \
  --trusted-host api.graphops.tech

You can also configure this once:

export PIP_EXTRA_INDEX_URL="https://YOUR_TOKEN:@api.graphops.tech/pypi/simple/"
export PIP_TRUSTED_HOST="api.graphops.tech"

Or in ~/.config/pip/pip.conf:

[global]
extra-index-url = https://YOUR_TOKEN:@api.graphops.tech/pypi/simple/
trusted-host = api.graphops.tech

Requirements

Python 3.10+
tree-sitter
tree-sitter-ruby
protobuf
graphops_interface

Quick Start

CLI

# Phase 1: build dictionary
ruby-base phase1 backend --exclude tmp --exclude vendor --output output/dictionary.json

# Phase 2: extract metadata using local rules
ruby-base phase2 backend --rules-dir backend/rules --output output/nodes.pb

With a backend URL for missing-rule downloads:

ruby-base phase2 backend \
  --rules-dir backend/rules \
  --backend-url http://localhost:3000 \
  --output output/nodes.pb

Enable strict ID validation:

RUBY_BASE_VALIDATE_IDS=1 ruby-base phase2 backend -o output/nodes.pb

Python API

from pathlib import Path
from ruby_base import TypeDictionaryBuilder, MetadataExtractionBuilder

# Phase 1: scan the project and write the class/module dictionary.
builder = TypeDictionaryBuilder()
dictionary = builder.build(
    root_path=Path("backend"),
    excluded_paths=["tmp", "log", "vendor", "node_modules"],
    output_path=Path("output/dictionary.json"),
)

# Phase 2: apply rules from the local rules directory and write the protobuf output.
extractor = MetadataExtractionBuilder(rules_dir=Path("backend/rules"))
extractor.build(
    root_path=Path("backend"),
    excluded_paths=["tmp", "log", "vendor", "node_modules"],
    output_path=Path("output/nodes.pb"),
    return_nodes=False,
)

How The Two Phases Work

Phase 1: dictionary building

Phase 1:

scans Ruby files
parses class and module definitions
records inheritance references
writes a dictionary keyed by fully qualified class/module name

The current dictionary payload includes:

id - stable node identifier
type - structural type: class, module, or namespace
rule - semantic rule name used by Phase 2
file_path - canonical relative file path

Example:

{
  "ApplicationJob": {
    "id": "...",
    "type": "class",
    "rule": "active_job",
    "file_path": "app/jobs/application_job.rb"
  }
}

Phase 1 uses inheritance to derive known semantic rules. For example:

MyJob < ApplicationJob < ActiveJob::Base

This resolves to:

structural type: class
semantic rule: active_job

If no known inheritance-based rule is found, the fallback is:

class for classes
module for modules
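
The inheritance walk described above can be sketched as follows. This is an illustration only, not the code in ruby_base.type_resolver: KNOWN_BASE_RULES, resolve_rule, and the parents mapping are hypothetical names, and the real resolver may handle namespaces and mixins differently.

```python
# Hypothetical mapping from framework base classes to semantic rule names.
KNOWN_BASE_RULES = {"ActiveJob::Base": "active_job"}


def resolve_rule(name, parents, structural_type):
    """Walk the superclass chain until a known base class is found.

    parents maps a class name to its direct superclass name.
    Falls back to the structural type ("class" or "module").
    """
    seen = set()
    current = name
    while current is not None and current not in seen:
        seen.add(current)  # guards against accidental cycles
        if current in KNOWN_BASE_RULES:
            return KNOWN_BASE_RULES[current]
        current = parents.get(current)
    return structural_type


parents = {"MyJob": "ApplicationJob", "ApplicationJob": "ActiveJob::Base"}
print(resolve_rule("MyJob", parents, "class"))       # active_job
print(resolve_rule("PlainThing", parents, "class"))  # class
```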

Phase 2: rule-driven extraction

Phase 2:

loads the Phase 1 dictionary
determines which rules are needed
downloads missing rules only when required
parses each Ruby file
applies the rule for each node during extraction
resolves method calls using target-node rules where needed
writes protobuf batches for nodes and namespaces

Phase 2 uses the dictionary in two ways:

type for structural output concerns such as class/module/namespace handling
rule for semantic rule loading and application
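
As a rough illustration of how the set of required rules could be derived from the Phase 1 dictionary, the sketch below collects the distinct rule values. The function name rules_needed is hypothetical and the id values are placeholders; the entry layout follows the dictionary example shown earlier.

```python
import json


def rules_needed(dictionary: dict) -> set[str]:
    """Collect the distinct semantic rule names referenced by the dictionary."""
    return {entry["rule"] for entry in dictionary.values() if entry.get("rule")}


# Placeholder ids; real ids are produced by Phase 1.
dictionary = json.loads("""
{
  "ApplicationJob": {"id": "id-1", "type": "class", "rule": "active_job",
                     "file_path": "app/jobs/application_job.rb"},
  "MyJob": {"id": "id-2", "type": "class", "rule": "active_job",
            "file_path": "app/jobs/my_job.rb"},
  "Util": {"id": "id-3", "type": "module", "rule": "module",
           "file_path": "lib/util.rb"}
}
""")

print(sorted(rules_needed(dictionary)))  # ['active_job', 'module']
```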

Rules

Rules live under rules/ruby/{rule_name}.json, for example:

rules/ruby/base.json
rules/ruby/class.json
rules/ruby/module.json
rules/ruby/active_job.json
rules/ruby/sidekiq_worker.json

Each rule may include:

extract - which metadata to extract
extra - extra extraction flags layered on top of base
exclude - extraction fields to remove from base
call_mappings - call aliases such as perform_later -> perform
method_operations - semantic operation labels such as Active Record find -> read
callback_trigger_options - callback option parsing rules
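
One way to read the extract, extra, and exclude fields is as a layering over the base rule. The sketch below assumes each field is a flat list of metadata names and that call_mappings is a plain name-to-name object; the actual rule JSON schema may differ, and effective_extract and map_call are hypothetical helpers.

```python
def effective_extract(base_rule: dict, rule: dict) -> list[str]:
    """Start from base's extract list, add the rule's extra, drop its exclude."""
    fields = list(base_rule.get("extract", []))
    for field in rule.get("extra", []):
        if field not in fields:
            fields.append(field)
    excluded = set(rule.get("exclude", []))
    return [field for field in fields if field not in excluded]


def map_call(rule: dict, method_name: str) -> str:
    """Apply a call alias such as perform_later -> perform, if one is defined."""
    return rule.get("call_mappings", {}).get(method_name, method_name)


# Assumed shapes, for illustration only.
base = {"extract": ["methods", "calls", "includes"]}
active_job = {
    "extra": ["callbacks"],
    "exclude": ["includes"],
    "call_mappings": {"perform_later": "perform"},
}

print(effective_extract(base, active_job))  # ['methods', 'calls', 'callbacks']
print(map_call(active_job, "perform_later"))  # perform
```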

Rule Loading Order

When Phase 2 needs a rule, RuleLoader checks these sources in order:

local --rules-dir
bundled package rules
local cache
backend API

Because the backend API is checked last, a rule is downloaded only when no local, bundled, or cached copy exists.
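
The lookup order can be sketched as a chain of sources tried in sequence. This is not the RuleLoader implementation: load_rule and the dict-backed sources are hypothetical stand-ins used to show that the backend is reached only when every earlier source misses.

```python
def load_rule(rule_name, sources):
    """Return (source_label, rule) from the first source that has the rule."""
    for label, fetch in sources:
        rule = fetch(rule_name)
        if rule is not None:
            return label, rule
    raise LookupError(f"rule not found in any source: {rule_name}")


# Stand-ins for the four sources; real sources read files or call the API.
local_rules = {"active_job": {"extract": ["methods"]}}
bundled_rules = {"class": {"extract": ["methods"]}}
cached_rules = {}
backend_calls = []


def fetch_from_backend(rule_name):
    backend_calls.append(rule_name)  # record every network lookup
    return {"extract": []} if rule_name == "sidekiq_worker" else None


sources = [
    ("rules-dir", local_rules.get),
    ("bundled", bundled_rules.get),
    ("cache", cached_rules.get),
    ("backend", fetch_from_backend),
]

print(load_rule("active_job", sources)[0])      # rules-dir
print(load_rule("sidekiq_worker", sources)[0])  # backend
print(backend_calls)                            # ['sidekiq_worker']
```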

Backend Rule API

The Rails backend serves Ruby rules at:

GET /api/v1/rules/ruby
GET /api/v1/rules/ruby/:type
POST /api/v1/rules/ruby/batch

The batch endpoint is used to fetch missing rules efficiently after Phase 1.
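
A request to the batch endpoint might be constructed as in the sketch below. Only the path comes from the list above; the JSON body shape ({"rules": [...]}) and the helper name build_batch_request are assumptions, so check the backend before relying on them. The request is built but not sent.

```python
import json
import urllib.request


def build_batch_request(backend_url: str, rule_names: list[str]) -> urllib.request.Request:
    """Build (but do not send) a POST request for a batch of rule names."""
    # Hypothetical payload shape; the real API may expect different keys.
    body = json.dumps({"rules": sorted(rule_names)}).encode("utf-8")
    return urllib.request.Request(
        url=f"{backend_url.rstrip('/')}/api/v1/rules/ruby/batch",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_batch_request("http://localhost:3000", ["sidekiq_worker", "active_job"])
print(request.get_method(), request.full_url)
```

Sending it would be a call to urllib.request.urlopen(request), followed by parsing the response in whatever format the backend returns.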

Output

Phase 2 writes protobuf+gzip batches split by kind:

node batches
namespace batches
a manifest file describing the output

The protobuf writer lives under ruby_base.proto.

Building and Publishing

Build a wheel locally:

pip install build
python -m build
ls -la dist/

A valid wheel should be a normal .whl zip archive and should include code plus packaged rules data.
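
A quick way to check a built wheel is to open it as a zip archive and list its contents. The sketch below uses only the standard library; the helper name inspect_wheel is hypothetical, and it assumes the packaged rules are .json files under a rules/ directory inside the wheel.

```python
import zipfile
from pathlib import Path


def inspect_wheel(wheel_path: Path) -> dict:
    """List Python modules and rule JSON files contained in a wheel."""
    if not zipfile.is_zipfile(wheel_path):
        raise ValueError(f"not a valid zip archive: {wheel_path}")
    with zipfile.ZipFile(wheel_path) as wheel:
        names = wheel.namelist()
    return {
        "python_files": [n for n in names if n.endswith(".py")],
        # Assumption: rules are packaged somewhere under a rules/ directory.
        "rule_files": [n for n in names if n.endswith(".json") and "/rules/" in n],
    }
```

Running inspect_wheel on each file in dist/ and confirming that both lists are non-empty gives a basic sanity check before publishing.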

Development Notes

Helpful areas to inspect when changing behavior:

ruby_base/builders/type_dictionary_builder.py
ruby_base/builders/metadata_extraction_builder.py
ruby_base/type_resolver.py
ruby_base/rules/rule_loader.py
ruby_base/parser/ast_extractor.py

License

This package is licensed under the MIT License.

See LICENSE for the full license text.
