# CIRCT Object File Symbol Analysis

This notebook analyzes the defined and undefined symbols in CIRCT's object (`.o`) files to understand symbol resolution, identify potential linkage issues, and visualize inter-object-file dependencies.

We:
- Load symbol tables for all `.o` files
- Analyze which symbols are defined or undefined where
- Build graphs of object-file dependencies
- Identify any undefined (unresolved) symbols not provided in this build tree

In [1]:
import json

In [2]:
import sys

In [3]:
from collections import Counter

In [4]:
import networkx as nx

## Load Symbol Information

We begin by loading a JSON file that maps each object file to its defined (`T`) and undefined (`U`) symbols (as output by tools like `nm`).

In [5]:
with open('symbols-in-o-files.json', 'r') as f:
    symbols_in_o_files = json.load(f)
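
The notebook does not show how `symbols-in-o-files.json` was produced. As a minimal sketch of one way to build the same per-file structure from `nm`-style text, the helper below groups symbol names by type code. The name `parse_nm_output` and the sample lines are hypothetical. It assumes the default three-column layout (address, type, name), with the address column blank for undefined symbols.

```python
def parse_nm_output(text):
    """Group symbol names by their one-letter nm type code."""
    symbols_by_type = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            # No address column, e.g. "U name" for an undefined symbol.
            symbol_type, name = parts
        elif len(parts) == 3:
            # Address, type, name.
            _address, symbol_type, name = parts
        else:
            # Blank lines, file headers, or anything unexpected are skipped.
            continue
        symbols_by_type.setdefault(symbol_type, []).append(name)
    return symbols_by_type


# Hypothetical sample in the assumed layout.
sample = """\
0000000000000000 T main
                 U ZSTD_isError
0000000000000010 t helper
"""
parsed = parse_nm_output(sample)
```

Applying such a helper to each object file and collecting the results in a dictionary keyed by file path would give the shape that `symbols_in_o_files` has here.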

## Symbol Types Found

We find the following symbol types in the dataset:

- `T`, `t`: Symbols defined in this object file's text (code) section (uppercase = global, lowercase = local)
- `U`: Undefined symbols, referenced here but defined elsewhere
- (Other types seen: `B`, `D`, `R`, `V`, `W`, `b`, `d`, `n`, `r`, `w`)

In this notebook, our focus is primarily on global definitions (uppercase types other than `U`, most commonly `T`) and on `U` (unresolved references).

In [6]:
all_symbol_types = set()
for o_file, symbol_types_to_symbols in symbols_in_o_files.items():
    all_symbol_types.update(symbol_types_to_symbols)
all_symbol_types

{'B', 'D', 'R', 'T', 'U', 'V', 'W', 'b', 'd', 'n', 'r', 't', 'w'}
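
The set above shows which type codes occur but not how common each one is. A short sketch of a per-type tally follows; `count_symbols_by_type` and the toy data are hypothetical, and the input is assumed to have the same shape as `symbols_in_o_files`.

```python
from collections import Counter


def count_symbols_by_type(symbols_in_o_files):
    """Count how many symbol entries each type code has across all files."""
    counts = Counter()
    for symbol_types_to_symbols in symbols_in_o_files.values():
        for symbol_type, symbols in symbol_types_to_symbols.items():
            counts[symbol_type] += len(symbols)
    return counts


# Toy data in the same shape as the loaded JSON (not real CIRCT symbols).
toy_symbols = {
    'a.o': {'T': ['main', 'foo'], 'U': ['bar']},
    'b.o': {'T': ['bar'], 'U': ['foo', 'puts'], 't': ['helper']},
}
toy_counts = count_symbols_by_type(toy_symbols)
```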

## Defined Symbols Analysis

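We map every globally defined symbol (any uppercase type other than `U`, so `T` but also `B`, `D`, `R`, `V`, and `W`) to the list of object files that define it. Each symbol should ideally be defined once. Multiple definitions can indicate ODR (One Definition Rule) violations but are more often benign: weak symbols (`V`, `W`) such as inline functions and template instantiations are emitted in every object file that uses them, and each tool defines its own `main`.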

In [7]:
t_symbols_to_o_files = {}
for o_file, symbol_types_to_symbols in symbols_in_o_files.items():
    for symbol_type, symbols in symbol_types_to_symbols.items():
        if symbol_type.isupper() and symbol_type != 'U':
            for symbol in symbols:
                t_symbols_to_o_files.setdefault(symbol, []).append(o_file)

In [8]:
len(t_symbols_to_o_files)

836842

In [9]:
Counter(len(o_files) for o_files in t_symbols_to_o_files.values())

Counter({1: 737151,
         2: 41639,
         3: 16825,
         4: 9177,
         5: 5905,
         6: 3998,
         8: 2280,
         7: 2129,
         9: 2006,
         10: 1459,
         12: 1184,
         11: 1167,
         13: 925,
         15: 581,
         14: 570,
         17: 507,
         20: 424,
         18: 420,
         32: 405,
         24: 351,
         19: 343,
         16: 322,
         21: 296,
         28: 270,
         23: 263,
         25: 261,
         26: 258,
         27: 249,
         22: 238,
         41: 212,
         29: 194,
         72: 140,
         31: 129,
         40: 119,
         30: 116,
         47: 113,
         33: 108,
         61: 98,
         42: 92,
         46: 89,
         87: 89,
         35: 84,
         58: 82,
         69: 82,
         39: 79,
         43: 79,
         45: 79,
         44: 78,
         34: 74,
         37: 74,
         49: 73,
         66: 72,
         80: 72,
         294: 66,
         57: 65,
         54: 63,
   

**Observation:**  
Most symbols (737,151 of 836,842, about 88%) are defined in exactly one object file, as expected.
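
To separate duplicates that may matter from benign weak ones, the count can be restricted to strong definitions. The sketch below is one possible approach, not part of the original analysis. `STRONG_TYPES` and `strong_duplicates` are hypothetical names, and the code assumes the usual `nm` convention that `V` and `W` mark weak definitions.

```python
# Assumption: B, D, R, T are strong global definitions; V and W are weak.
STRONG_TYPES = {'B', 'D', 'R', 'T'}


def strong_duplicates(symbols_in_o_files):
    """Return symbols with a strong definition in more than one object file."""
    definers = {}
    for o_file, symbol_types_to_symbols in symbols_in_o_files.items():
        for symbol_type, symbols in symbol_types_to_symbols.items():
            if symbol_type in STRONG_TYPES:
                for symbol in symbols:
                    definers.setdefault(symbol, []).append(o_file)
    return {
        symbol: o_files
        for symbol, o_files in definers.items()
        if len(o_files) > 1
    }


# Toy data: two tools each define main; a weak helper appears in both.
toy_symbols = {
    'tool_a.o': {'T': ['main'], 'W': ['_Z6helperv']},
    'tool_b.o': {'T': ['main'], 'W': ['_Z6helperv']},
    'lib.o': {'T': ['_Z3foov']},
}
duplicates = strong_duplicates(toy_symbols)
```

In the toy data only `main` is reported, because the weak `_Z6helperv` copies are ignored.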

## Undefined Symbols Analysis

We then map each undefined symbol (type `U`) to the list of object files that reference it. These are symbols that the object files expect to be provided elsewhere: by other object files, by static or dynamic libraries, or by the runtime.

In [10]:
u_symbols_to_o_files = {}
for o_file, symbol_types_to_symbols in symbols_in_o_files.items():
    for u_symbol in symbol_types_to_symbols.get('U', ()):
        u_symbols_to_o_files.setdefault(u_symbol, []).append(o_file)

In [11]:
len(u_symbols_to_o_files)

62553

In [12]:
Counter(
    len(t_symbols_to_o_files.get(u_symbol, []))
    for u_symbol in u_symbols_to_o_files
)

Counter({1: 61893,
         0: 614,
         6: 32,
         2: 3,
         5: 3,
         3: 2,
         11: 1,
         19: 1,
         7: 1,
         4: 1,
         244: 1,
         255: 1})
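
The counter above splits the 62,553 undefined symbols into three groups by number of defining files: 0 (614 symbols, not provided by any object file), 1 (61,893, resolved unambiguously), and more than 1 (46, ambiguous). A minimal sketch that makes this classification explicit follows; `classify_undefined` and the toy inputs are hypothetical.

```python
def classify_undefined(u_symbols, t_symbols_to_o_files):
    """Bucket undefined symbols by how many object files define them."""
    buckets = {'external': [], 'unique': [], 'ambiguous': []}
    for symbol in u_symbols:
        definer_count = len(t_symbols_to_o_files.get(symbol, []))
        if definer_count == 0:
            buckets['external'].append(symbol)
        elif definer_count == 1:
            buckets['unique'].append(symbol)
        else:
            buckets['ambiguous'].append(symbol)
    return buckets


# Toy inputs in the same shape as the notebook's two mappings.
toy_u = {'foo': ['main.o'], 'ZSTD_isError': ['main.o'], 'main': ['crt.o']}
toy_t = {'foo': ['foo.o'], 'main': ['tool_a.o', 'tool_b.o']}
toy_buckets = classify_undefined(toy_u, toy_t)
```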

In [13]:
sorted(
    u_symbol
    for u_symbol in u_symbols_to_o_files
    if u_symbol not in t_symbols_to_o_files
)

['ZSTD_CCtx_setParameter',
 'ZSTD_compress2',
 'ZSTD_compressBound',
 'ZSTD_createCCtx',
 'ZSTD_decompress',
 'ZSTD_freeCCtx',
 'ZSTD_getErrorName',
 'ZSTD_isError',
 '_Exit',
 '_Unwind_Backtrace',
 '_Unwind_GetIP',
 '_Unwind_Resume',
 '_ZN5circt13importVerilogERN4llvm9SourceMgrEPN4mlir11MLIRContextERNS3_11TimingScopeENS3_8ModuleOpEPKNS_20ImportVerilogOptionsE',
 '_ZN5circt15getSlangVersionEv',
 '_ZN5circt17preprocessVerilogERN4llvm9SourceMgrEPN4mlir11MLIRContextERNS3_11TimingScopeERNS0_11raw_ostreamEPKNS_20ImportVerilogOptionsE',
 '_ZN5circt26populateLlhdToCorePipelineERN4mlir13OpPassManagerERKNS_25LlhdToCorePipelineOptionsE',
 '_ZN5circt27populateMooreToCorePipelineERN4mlir13OpPassManagerE',
 '_ZN5circt30populateVerilogToMoorePipelineERN4mlir13OpPassManagerE',
 '_ZN5circt3lsp13VerilogServer11addDocumentERKN4llvm3lsp10URIForFileENS2_9StringRefElRNSt3__16vectorINS3_10DiagnosticENS8_9allocatorISA_EEEE',
 '_ZN5circt3lsp13VerilogServer14getLocationsOfERKN4llvm3lsp10URIForFileERKNS3_8Posit

In [14]:
sorted(
    u_symbol
    for u_symbol in u_symbols_to_o_files
    if u_symbol in t_symbols_to_o_files
    and len(t_symbols_to_o_files.get(u_symbol, [])) > 1
)

['ExitOnErr',
 'LLVMFuzzerInitialize',
 'LLVMFuzzerTestOneInput',
 '_Z19registerMyExtensionRN4mlir15DialectRegistryE',
 '_ZN3toy4dumpERNS_9ModuleASTE',
 '_ZN3toy7mlirGenERN4mlir11MLIRContextERNS_9ModuleASTE',
 '_ZN4mlir3toy10ConstantOp6createERNS_9OpBuilderENS_8LocationENS_4TypeENS_17DenseElementsAttrE',
 '_ZN4mlir3toy10ConstantOp6createERNS_9OpBuilderENS_8LocationENS_9TypeRangeENS_10ValueRangeERKNS0_6detail28ConstantOpGenericAdaptorBase10PropertiesEN4llvm8ArrayRefINS_14NamedAttributeEEE',
 '_ZN4mlir3toy10ConstantOp6createERNS_9OpBuilderENS_8LocationEd',
 '_ZN4mlir3toy10ConstantOp8getValueEv',
 '_ZN4mlir3toy10ToyDialectC1EPNS_11MLIRContextE',
 '_ZN4mlir3toy11TransposeOp27getCanonicalizationPatternsERNS_17RewritePatternSetEPNS_11MLIRContextE',
 '_ZN4mlir3toy11TransposeOp6createERNS_9OpBuilderENS_8LocationENS_5ValueE',
 '_ZN4mlir3toy13GenericCallOp6createERNS_9OpBuilderENS_8LocationEN4llvm9StringRefENS5_8ArrayRefINS_5ValueEEE',
 '_ZN4mlir3toy21createLowerToLLVMPassEv',
 '_ZN4mlir3toy23cr

## Constructing the Object File Dependency Graph

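A directed graph is created: nodes are `.o` files, and an edge goes from a file that needs a symbol to the file that defines it. Edges are added only when exactly one object file defines the symbol, so ambiguous (multiply defined) symbols contribute no edges. The graph captures direct symbol-based dependencies between object files (not including external libraries).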

In [15]:
dependency_graph = nx.DiGraph()

for u_symbol, u_symbol_o_files in u_symbols_to_o_files.items():
    if u_symbol in t_symbols_to_o_files:
        t_symbol_o_files = t_symbols_to_o_files[u_symbol]
        if len(t_symbol_o_files) == 1:
            t_symbol_o_file = t_symbol_o_files[0]
            for u_symbol_o_file in u_symbol_o_files:
                dependency_graph.add_edge(u_symbol_o_file, t_symbol_o_file)

In [16]:
len(dependency_graph.nodes)

4317

In [17]:
len(dependency_graph.edges)

80265

In [18]:
circt_opt_cpp_o_dependencies = nx.descendants(
    dependency_graph,
    'circt/tools/circt-opt/circt-opt.cpp.o'
)

In [19]:
len(circt_opt_cpp_o_dependencies)

850
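
The 850 object files above are the transitive dependencies of `circt-opt.cpp.o`. To illustrate what the edge construction and `nx.descendants` compute, here is a plain-Python sketch on toy data. The helpers `build_edges` and `descendants` are hypothetical stand-ins and are not how networkx is implemented.

```python
from collections import deque


def build_edges(u_symbols_to_o_files, t_symbols_to_o_files):
    """Map each object file to the files that uniquely define what it needs."""
    edges = {}
    for symbol, users in u_symbols_to_o_files.items():
        definers = t_symbols_to_o_files.get(symbol, [])
        if len(definers) == 1:  # skip external and ambiguous symbols
            for user in users:
                edges.setdefault(user, set()).add(definers[0])
    return edges


def descendants(edges, source):
    """Breadth-first search for every node reachable from source."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for successor in edges.get(node, ()):
            if successor not in seen and successor != source:
                seen.add(successor)
                queue.append(successor)
    return seen


# Toy data: main.o needs foo (in foo.o), which needs bar (in bar.o).
toy_u = {'foo': ['main.o'], 'bar': ['foo.o'], 'dup': ['main.o'], 'ext': ['bar.o']}
toy_t = {'foo': ['foo.o'], 'bar': ['bar.o'], 'dup': ['x.o', 'y.o']}
toy_edges = build_edges(toy_u, toy_t)
toy_deps = descendants(toy_edges, 'main.o')
```

Note that `dup` adds no edge because two files define it, which mirrors the `len(t_symbol_o_files) == 1` check in the notebook.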

In [20]:
root_o_file = 'circt/tools/circt-opt/circt-opt.cpp.o'
implemented = set()
unimplemented = set()

# Collect symbols from circt-opt.cpp.o itself and from every object file
# it transitively depends on.
for o_file in {root_o_file} | circt_opt_cpp_o_dependencies:
    for symbol_type, symbols in symbols_in_o_files[o_file].items():
        if symbol_type.isupper() and symbol_type != 'U':
            implemented.update(symbols)
        elif symbol_type == 'U':
            unimplemented.update(symbols)

# Whatever is still undefined must come from outside this set of object files.
unimplemented -= implemented

In [21]:
len(unimplemented)

390

In [22]:
sorted(unimplemented)

['ZSTD_CCtx_setParameter',
 'ZSTD_compress2',
 'ZSTD_compressBound',
 'ZSTD_createCCtx',
 'ZSTD_decompress',
 'ZSTD_freeCCtx',
 'ZSTD_getErrorName',
 'ZSTD_isError',
 '_Exit',
 '_Unwind_Backtrace',
 '_Unwind_GetIP',
 '_Unwind_Resume',
 '_ZNKSt3__110error_code7messageEv',
 '_ZNKSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE4findEcm',
 '_ZNKSt3__112basic_stringIcNS_11char_traitsIcEENS_9allocatorIcEEE7compareEmmPKcm',
 '_ZNKSt3__114error_category10equivalentERKNS_10error_codeEi',
 '_ZNKSt3__114error_category10equivalentEiRKNS_15error_conditionE',
 '_ZNKSt3__114error_category23default_error_conditionEi',
 '_ZNKSt3__119__shared_weak_count13__get_deleterERKSt9type_info',
 '_ZNKSt3__123__match_any_but_newlineIcE6__execERNS_7__stateIcEE',
 '_ZNKSt3__14__fs10filesystem4path10__filenameEv',
 '_ZNKSt3__14__fs10filesystem4path11__extensionEv',
 '_ZNKSt3__14__fs10filesystem4path13__parent_pathEv',
 '_ZNKSt3__16locale4nameEv',
 '_ZNKSt3__16locale9use_facetERNS0_2idE',
 '_ZNSt11logic_er
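
The remaining symbols appear to fall into a few families that hint at which external library provides them. The sketch below groups them by name prefix. The prefix table is a heuristic assumption for illustration (for example, `St` in a mangled name abbreviates `std`), and `group_by_provider` is a hypothetical helper, not a reliable resolver.

```python
from collections import defaultdict

# Heuristic, assumed prefix-to-provider mapping; extend as needed.
PREFIX_TO_PROVIDER = [
    ('ZSTD_', 'zstd'),
    ('_Unwind_', 'unwinder'),
    ('_ZNSt', 'C++ standard library'),
    ('_ZNKSt', 'C++ standard library'),
    ('_ZSt', 'C++ standard library'),
]


def group_by_provider(symbols):
    """Group symbol names by the first matching prefix, else 'other'."""
    groups = defaultdict(list)
    for symbol in sorted(symbols):
        for prefix, provider in PREFIX_TO_PROVIDER:
            if symbol.startswith(prefix):
                groups[provider].append(symbol)
                break
        else:
            groups['other'].append(symbol)
    return dict(groups)


# A few names taken from the list above.
sample_symbols = [
    'ZSTD_isError',
    '_Exit',
    '_Unwind_Resume',
    '_ZNKSt3__16locale4nameEv',
]
sample_groups = group_by_provider(sample_symbols)
```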

In [23]:
print("$AR", "rcs", "circt_opt_cpp_o_dependencies.a", " ".join(circt_opt_cpp_o_dependencies))

$AR rcs circt_opt_cpp_o_dependencies.a llvm-project/build/tools/mlir/lib/Dialect/Utils/CMakeFiles/obj.MLIRDialectUtils.dir/VerificationUtils.cpp.o circt/lib/Dialect/FIRRTL/FIRRTLUtils.cpp.o circt/lib/Conversion/SimToSV/SimToSV.cpp.o llvm-project/build/tools/mlir/lib/Dialect/IRDL/CMakeFiles/obj.MLIRIRDL.dir/IRDLSymbols.cpp.o circt/lib/Support/SymCache.cpp.o circt/lib/Dialect/LTL/LTLDialect.cpp.o llvm-project/build/tools/mlir/lib/Conversion/ArithToLLVM/CMakeFiles/obj.MLIRArithToLLVM.dir/ArithToLLVM.cpp.o llvm-project/build/tools/mlir/lib/Analysis/Presburger/CMakeFiles/obj.MLIRPresburger.dir/PresburgerSpace.cpp.o llvm-project/build/lib/Demangle/CMakeFiles/LLVMDemangle.dir/MicrosoftDemangleNodes.cpp.o circt/lib/Support/LoweringOptions.cpp.o llvm-project/build/tools/mlir/lib/Analysis/Presburger/CMakeFiles/obj.MLIRPresburger.dir/Utils.cpp.o llvm-project/build/lib/Support/CMakeFiles/LLVMSupport.dir/Threading.cpp.o circt/lib/Dialect/FIRRTL/Transforms/LowerMemory.cpp.o llvm-project/build/lib/IR
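
Because `circt_opt_cpp_o_dependencies` is a set, the order of paths in the printed command can vary between runs, and a path containing spaces would be split by the shell. A small sketch of a more defensive variant follows, using `sorted` and the standard library's `shlex.join`. The helper name `build_archive_command` is hypothetical. `$AR` is kept outside the quoted part so the shell still expands it.

```python
import shlex


def build_archive_command(archive_name, o_files):
    """Return a deterministic, shell-quoted `$AR rcs` command line."""
    arguments = ['rcs', archive_name, *sorted(o_files)]
    # shlex.join quotes any argument that contains shell-special characters.
    return '$AR ' + shlex.join(arguments)


# Toy paths, including one with spaces to show the quoting.
toy_command = build_archive_command(
    'deps.a', {'b.o', 'a.o', 'dir with space/c.o'}
)
```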

```
$CXX circt/tools/circt-opt/circt-opt.cpp.o circt_opt_cpp_o_dependencies.a -o circt-opt -lz -lzstd
```