Automatic AI Bill of Materials (AIBOM) generator for Python codebases that use LangChain and related AI/ML tooling.
`aibom_generator.py` performs static analysis of a target Python repository and produces a JSON report (`AI_BOM.json`) that inventories AI components found in source code.
It scans all .py files recursively and extracts:
- models
  - LangChain LLM/chat model class usage (for example `OpenAI`, `ChatOpenAI`, `HuggingFaceHub`, `Ollama`)
  - best-effort model identifiers from constructor args like `model`, `model_name`, `model_id`, `checkpoint`
- datasets
  - vector store related usage (for example `FAISS`, `Chroma`, `Pinecone`)
  - best-effort dataset/index references such as `path`, `persist_directory`, `index_name`, `collection_name`
- tools
  - LangChain tool/agent-related calls (for example `initialize_agent`, `load_tools`, `Tool`, `AgentExecutor`)
- frameworks
  - imported AI frameworks (for example `langchain`, `transformers`, `torch`)
  - installed package version when available via `importlib.metadata`
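The model-extraction side of this can be sketched with a small `ast.NodeVisitor`. This is a minimal illustration, not the script's actual implementation; the class and constant names (`ModelVisitor`, `MODEL_CALLS`, `MODEL_KWARGS`) are hypothetical:

```python
import ast

# Illustrative subsets of the patterns the scanner looks for.
MODEL_CALLS = {"OpenAI", "ChatOpenAI", "HuggingFaceHub", "Ollama"}
MODEL_KWARGS = {"model", "model_name", "model_id", "checkpoint"}

class ModelVisitor(ast.NodeVisitor):
    def __init__(self):
        self.models = []

    def visit_Call(self, node):
        # Handle both `ChatOpenAI(...)` and `chat_models.ChatOpenAI(...)`.
        name = getattr(node.func, "id", getattr(node.func, "attr", None))
        if name in MODEL_CALLS:
            # Only literal keyword values can be resolved statically.
            params = {
                kw.arg: kw.value.value
                for kw in node.keywords
                if kw.arg in MODEL_KWARGS and isinstance(kw.value, ast.Constant)
            }
            self.models.append({
                "type": name,
                # Non-literal model names fall back to "unknown".
                "model": next(iter(params.values()), "unknown"),
                "details": {"call": name, "params": params},
            })
        self.generic_visit(node)

visitor = ModelVisitor()
visitor.visit(ast.parse('llm = ChatOpenAI(model="gpt-4", temperature=0)'))
print(visitor.models)  # one entry: type "ChatOpenAI", model "gpt-4"
```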
- Recursively find Python files in the target directory.
- Parse each file into a Python AST (`ast.parse`).
- Visit imports and function/class calls with an AST visitor.
- Match known LangChain/model/vectorstore/tool patterns.
- Build a consolidated dictionary with top-level keys: `models`, `datasets`, `tools`, `frameworks`.
- De-duplicate entries and write to JSON.
- Print a short terminal summary.
- Python 3.8+
- No external dependencies required for core functionality.
```
python aibom_generator.py /path/to/project
```

This writes `AI_BOM.json` in your current working directory. To choose an output path:

```
python aibom_generator.py /path/to/project -o /path/to/output/AI_BOM.json
```

Example output from `python aibom_generator.py .`:

```json
{
  "models": [
    {
      "type": "ChatOpenAI",
      "model": "gpt-4",
      "source_file": "app/pipeline.py",
      "details": {
        "call": "ChatOpenAI",
        "params": {
          "model": "gpt-4"
        }
      }
    }
  ],
  "datasets": [
    {
      "name": "FAISS",
      "type": "FAISS.from_documents",
      "used_for": "Vector store / dataset ingestion",
      "source_file": "app/retrieval.py",
      "details": {
        "persist_directory": "./faiss_index"
      }
    }
  ],
  "tools": [
    {
      "name": "initialize_agent",
      "purpose": "Agent/tool usage detected",
      "source_file": "app/agent.py",
      "details": {
        "call": "initialize_agent",
        "params": {}
      }
    }
  ],
  "frameworks": [
    {
      "name": "langchain",
      "version": "0.2.0"
    }
  ]
}
```

`AI_BOM.json` is intended to support inventory, review, and compliance workflows.
- Identify what models are in use and where (`source_file`).
- Review external dependencies/framework versions for patch and compatibility planning.
- Flag unknown model identifiers (`"model": "unknown"`) for manual follow-up.
- Use `frameworks` to quickly check what AI libraries are present and which versions are installed.
- Cross-check for deprecated APIs or vulnerable versions.
- Inspect `datasets` entries for vector store/index paths and collection names.
- Confirm which files implement ingestion/indexing logic.
- Review `tools` entries to understand where autonomous/tool-enabled logic exists.
- Combine with code review in `source_file` paths for deeper behavior analysis.
- This is static analysis, so results are best-effort.
- Dynamic patterns (runtime imports, indirect wrappers, values built in many steps) may not resolve fully.
- Some fields can be `unknown` when model names or dataset paths are not literal strings.
- Version reporting depends on packages being installed in the environment where the script runs.
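For example, a model name assembled at runtime is not a literal in the AST, so no static scanner of this kind can recover it and the entry falls back to `unknown`:

```python
import ast

# The `model` argument is a Subscript expression, not an ast.Constant,
# so its value cannot be read off the parse tree.
source = 'llm = ChatOpenAI(model=os.environ["LLM_MODEL"])'
call = ast.parse(source).body[0].value
kw = call.keywords[0]
print(isinstance(kw.value, ast.Constant))  # False
```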
```
python aibom_generator.py . -o AI_BOM.json
python -m py_compile aibom_generator.py
```

See LICENSE.