CPE Parser and Tree Builder

A C library for parsing Common Platform Enumeration (CPE) data and building hierarchical tree structures from it. Trees can be stored in the Tree Encoding Format (TEF), a binary format with optional zlib compression and CRC32 integrity verification.

Features

  • Fast, multi-threaded CPE parsing from XML dictionaries
  • Hierarchical tree representation of CPE entries
  • Optimized storage with the Tree Encoding Format (TEF)
  • Optional zlib compression
  • CRC32 data integrity verification
  • C library with a clean, well-documented API
  • Go language bindings
  • Thread-safe work queue for parallel processing

Project Structure

  • cpe.h/c: Core CPE parsing and data structures
  • cpe_tree.h/c: Hierarchical tree representation and TEF format
  • cpe_lib.h/c: High-level library API
  • work_queue.h/c: Thread-safe work queue for parallel processing
  • cpe.go: Go language bindings

Usage

As C Library

#include <stdio.h>
#include <cpe/cpe.h>
#include <cpe/cpe_tree.h>
#include <cpe/cpe_lib.h>

int main(void) {
    // Initialize the library
    cpe_init();

    // Parse a CPE string
    CPE *entry = parse_cpe("cpe:2.3:a:microsoft:windows:10:*:*:*:*:*:*:*");

    // Create a tree and add the CPE
    CPETreeNode *root = create_tree_node(CPEValue_Root, NULL);
    add_cpe(root, entry);

    // Write to TEF file (skip the write if the file cannot be opened)
    FILE *file = fopen("output.tef", "wb");
    if (file != NULL) {
        write_tree_TEF(file, root, 1); // 1 = use compression
        fclose(file);
    } else {
        perror("output.tef");
    }

    // Clean up
    free_cpe(entry);
    free_cpe_tree(root);
    cpe_cleanup();
    return 0;
}
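
To make the input format concrete, here is a minimal sketch of how a CPE 2.3 formatted string breaks down into its 13 colon-separated fields: the `cpe` prefix, the `2.3` version, and the 11 attributes (part, vendor, product, version, update, edition, language, sw_edition, target_sw, target_hw, other). The helper `split_cpe23` and the `CPE_FIELD_*` constants are hypothetical names for illustration only; they are not part of this library and do not describe how `parse_cpe` is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CPE_FIELD_COUNT 13   /* "cpe", "2.3", then 11 attributes */
#define CPE_FIELD_MAX   128  /* illustrative per-field size limit */

/* Hypothetical helper, not part of the library's API.
 * Splits a CPE 2.3 formatted string on unescaped ':' characters.
 * Returns 0 on success, -1 if the string is malformed or a field is too long. */
static int split_cpe23(const char *s, char fields[CPE_FIELD_COUNT][CPE_FIELD_MAX])
{
    assert(s != NULL && fields != NULL);
    size_t field = 0, pos = 0;

    for (; *s != '\0'; s++) {
        if (*s == ':') {                     /* unescaped colon ends a field */
            fields[field][pos] = '\0';
            if (++field == CPE_FIELD_COUNT)
                return -1;                   /* too many fields */
            pos = 0;
            continue;
        }
        if (*s == '\\' && s[1] != '\0') {    /* keep escape pairs such as "\:" intact */
            if (pos + 2 >= CPE_FIELD_MAX)
                return -1;
            fields[field][pos++] = *s++;
        }
        if (pos + 1 >= CPE_FIELD_MAX)
            return -1;
        fields[field][pos++] = *s;
    }
    fields[field][pos] = '\0';

    if (field != CPE_FIELD_COUNT - 1)
        return -1;                           /* too few fields */
    if (strcmp(fields[0], "cpe") != 0 || strcmp(fields[1], "2.3") != 0)
        return -1;                           /* wrong prefix or version */
    return 0;
}
```

For the string used above, `fields[2]` holds the part (`a`), `fields[3]` the vendor (`microsoft`), `fields[4]` the product (`windows`), and `fields[5]` the version (`10`).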

Using Go Bindings

import "github.com/beldmian/cpe/cpe"

// Initialize the library
cpe.Initialize()
defer cpe.Cleanup()

// Parse a CPE string
cpeEntry, _ := cpe.ParseCPE("cpe:2.3:a:microsoft:windows:10:*:*:*:*:*:*:*")
defer cpeEntry.Free()

// Create a tree and add the CPE
tree := cpe.NewCPETree()
defer tree.Free()
tree.AddCPE(cpeEntry)

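// Write to TEF file (true = use compression)
tree.WriteToFile("output.tef", true)

// Parse a dictionary file with 8 worker threads.
// A separate variable keeps the deferred tree.Free() above tied to the first tree.
dictTree, _ := cpe.ParseDictionary("cpe_dictionary.xml", 8)
defer dictTree.Free()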

TEF File Format

The Tree Encoding Format (TEF) is a binary format optimized for storing hierarchical CPE data.

  • 4 bytes: "TEF" magic string
  • 4 bytes: format version
  • 4 bytes: flags (bit 0: compression)
  • 4 bytes: checksum (CRC32 of data section)
  • 8 bytes: metadata section length
  • [Metadata section]:
    • 8 bytes: total node count
    • 8 bytes: max tree depth
    • 8 bytes: data section length
  • [Data section - recursive tree structure]:
    • 1 byte: node type
    • Variable-length encoding for children count
    • Variable-length encoding for value length
    • n bytes: value string (if length > 0)
    • [recursive entries for each child]
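
The fixed-size part of the layout above (magic, version, flags, checksum, metadata length) adds up to 24 bytes. The following is a minimal sketch of reading it from a memory buffer. It makes two assumptions the format description does not state: multi-byte integers are little-endian, and only the first three bytes of the magic field are compared because the fourth byte is not documented. `TefHeader`, `read_le`, and `tef_parse_header` are hypothetical names, not the library's own reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TEF_HEADER_SIZE     24    /* 4 + 4 + 4 + 4 + 8 bytes, per the layout above */
#define TEF_FLAG_COMPRESSED 0x1u  /* bit 0 of the flags field */

/* Hypothetical in-memory view of the fixed header fields. */
typedef struct {
    uint32_t version;
    uint32_t flags;
    uint32_t checksum;      /* CRC32 of the data section */
    uint64_t metadata_len;  /* length of the metadata section in bytes */
} TefHeader;

/* Read an n-byte unsigned integer, least significant byte first.
 * Little-endian byte order is an assumption of this sketch. */
static uint64_t read_le(const unsigned char *p, int n)
{
    uint64_t v = 0;
    for (int i = n - 1; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Returns 0 on success, -1 if the buffer is too short or the magic does not match. */
static int tef_parse_header(const unsigned char *buf, size_t len, TefHeader *out)
{
    assert(buf != NULL && out != NULL);
    if (len < TEF_HEADER_SIZE)
        return -1;
    if (memcmp(buf, "TEF", 3) != 0)   /* fourth magic byte is left unchecked */
        return -1;
    out->version      = (uint32_t)read_le(buf + 4, 4);
    out->flags        = (uint32_t)read_le(buf + 8, 4);
    out->checksum     = (uint32_t)read_le(buf + 12, 4);
    out->metadata_len = read_le(buf + 16, 8);
    return 0;
}
```

A reader would then test `header.flags & TEF_FLAG_COMPRESSED` to decide whether the data section needs to be inflated before it is walked.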

TEF Advantages

  1. Data integrity: A CRC32 checksum over the data section detects corrupted files
  2. Compression: Optional zlib compression reduces file size
  3. Variable-length encoding: More efficient storage for small values
  4. Metadata: Statistics about the tree structure (node count, depth)
  5. Forward compatibility: Flags field allows for future extensions
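
Two of these points can be illustrated in a few lines of C. The sketch below assumes the checksum is the standard CRC-32 that zlib's `crc32()` computes (reflected polynomial 0xEDB88320) and that the variable-length integers use a LEB128-style scheme with 7 payload bits per byte. Neither detail is confirmed by the format description, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
 * Slow but table-free; assumed to match the checksum stored in the header. */
static uint32_t crc32_bitwise(const unsigned char *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* LEB128-style encoding (an assumption about TEF): 7 payload bits per byte,
 * high bit set on every byte except the last.
 * out must have room for 10 bytes. Returns the number of bytes written. */
static size_t varint_encode(uint64_t value, unsigned char *out)
{
    assert(out != NULL);
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (unsigned char)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (unsigned char)value;
    return n;
}

/* Returns the number of bytes consumed, or 0 if the input is truncated
 * or longer than the 10 bytes a 64-bit value can need. */
static size_t varint_decode(const unsigned char *in, size_t len, uint64_t *value)
{
    assert(in != NULL && value != NULL);
    uint64_t v = 0;
    for (size_t i = 0; i < len && i < 10; i++) {
        v |= (uint64_t)(in[i] & 0x7F) << (7 * i);
        if ((in[i] & 0x80) == 0) {
            *value = v;
            return i + 1;
        }
    }
    return 0;
}
```

Under this scheme a children count or value length below 128 takes a single byte instead of a fixed 4 or 8, which is where the saving for small values comes from.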

Advanced Usage

Multithreaded Processing

Dictionary parsing is distributed across worker threads through a thread-safe work queue:

// Auto-detect optimal thread count
int threads = get_optimal_thread_count();

// Parse dictionary with multiple threads
CPETreeNode *root = cpe_parse_dictionary("input.xml", threads);
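
For readers who prefer to pick the thread count themselves, the following is a minimal sketch of one common way to derive it from the number of online processors. It is not the implementation of `get_optimal_thread_count()`; `detect_thread_count` is a hypothetical helper, and `_SC_NPROCESSORS_ONLN` is a widely supported `sysconf` extension (Linux, macOS, the BSDs) rather than a guaranteed POSIX name.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: number of online processors, clamped to [1, max_threads]. */
static int detect_thread_count(int max_threads)
{
    assert(max_threads >= 1);
    long n = sysconf(_SC_NPROCESSORS_ONLN);  /* returns -1 if unsupported */
    if (n < 1)
        n = 1;                               /* fall back to a single thread */
    if (n > max_threads)
        n = max_threads;
    return (int)n;
}
```

The result can be passed as the second argument of `cpe_parse_dictionary`, for example `cpe_parse_dictionary("input.xml", detect_thread_count(16))`.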

Error Handling

if (!cpe_init()) {
    fprintf(stderr, "Error: %s\n", cpe_get_last_error());
    return 1;
}
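
The "return a status, then query a message" pattern shown above is often backed by a per-thread error buffer. The sketch below illustrates that pattern in isolation; it is an assumption about how such an API can be built, not a description of how `cpe_get_last_error()` works internally, and every name in it is hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* One buffer per thread, so concurrent workers do not overwrite each other's message. */
static _Thread_local char last_error[256];

/* Hypothetical helper: store a formatted message for the calling thread. */
static void set_last_error(const char *fmt, ...)
{
    assert(fmt != NULL);
    va_list ap;
    va_start(ap, fmt);
    vsnprintf(last_error, sizeof last_error, fmt, ap);
    va_end(ap);
}

/* Hypothetical helper: return the calling thread's most recent message. */
static const char *get_last_error(void)
{
    return last_error;
}

/* Example operation that reports failure through the buffer.
 * Returns 1 on success and 0 on failure, mirroring the truthy-on-success
 * convention of the cpe_init() check above. */
static int open_input(const char *path, FILE **out)
{
    assert(path != NULL && out != NULL);
    *out = fopen(path, "rb");
    if (*out == NULL) {
        set_last_error("cannot open '%s'", path);
        return 0;
    }
    return 1;
}
```

A caller checks the return value first and reads the message only on failure, since a successful call leaves the previous message in place.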

License

MIT License
