Transform Hours of Manual Documentation into Minutes of Automated Perfection
Automated technical documentation for Power BI semantic models using Model Context Protocol (MCP) and Python. Generate professional, comprehensive documentation in minutes instead of hours.
Extracts metadata from Power BI projects (.pbip / .pbism files) and automatically generates professional technical design documents including:
✅ Data Model Architecture — Star schema with fact/dimension tables
✅ Relationship Mapping — Complete relationship model with cardinality
✅ Measure Catalog — All 48+ measures with business logic & DAX code
✅ Security & Governance — RLS, access control, data classification
✅ Executive Summary — High-level architectural blueprint
Result: Professional 400+ line technical documentation. Delivery time: 3 minutes.
| Aspect | Manual | Automated |
|---|---|---|
| Time Required | 12+ hours | 3 minutes |
| Completeness | ~15% | 100% |
| Accuracy | Prone to errors | 100% precise |
| Updates | Never | Every run |
| Format | Scattered (.docx, .pptx) | Structured Markdown → DOCX → PDF |
Python 3.10+
pip install -r requirements.txt# Clone the repository
git clone https://github.com/yourusername/power-bi-docs-generator.git
cd power-bi-docs-generator
# Create virtual environment
python -m venv .venv
.venv\Scripts\activate # Windows
source .venv/bin/activate # macOS/Linux
# Install dependencies
pip install -r requirements.txtpython ConvertToPdf.py path/to/your/TechnicalDesign_YourReport.mdpython ConvertToPdf.pyAutomatically finds all TechnicalDesign_*.md files and converts them.
from AutomaticDocumentation import extract_pbip_metadata
metadata = extract_pbip_metadata("path/to/your/project.pbip")power-bi-docs-generator/
├── README.md
├── requirements.txt
├── .gitignore
│
├── AutomaticDocumentation.py
│ └── Core metadata extraction & documentation generation
│
├── ConvertToPdf.py
│ └── Markdown → DOCX → PDF conversion pipeline
│
├── semiautomatic_prompt.md
│ └── MCP workflow instructions & templates
│
├── templates/
│ └── Professional_Template.docx
│ └── Corporate Word template for styling
│
├── input/
│ └── Your .pbip/.pbism Power BI projects here
│
└── output/
├── TechnicalDesign_[ProjectName].md
├── TechnicalDesign_[ProjectName].docx
└── TechnicalDesign_[ProjectName].pdf
# Reads Power BI project structure
metadata = extract_pbip_metadata("project.pbip")
# Returns: tables, columns, measures, relationships as JSON# Applies professional documentation template
doc = apply_template(metadata, "templates/structure.md")
# Returns: Markdown with full documentationMarkdown → Pandoc → DOCX (with styling) → Word COM → PDF
The pipeline applies:
- Corporate styling (colors, fonts, headers)
- Table formatting (alternating rows, proper widths)
- Code block styling (GitHub dark theme for DAX)
- Professional spacing (keep headers with paragraphs, prevent orphans)
- 48 measures across 3 folders
- 13 dimension tables
- 9 relationships
- Complex DAX expressions
# Technical Design & Specifications
## 1. Introduction & Executive Summary
[Executive overview and project scope]
## 4.2 Data Model
### 4.2.1 Star Schema Architecture
**Fact Tables**
| Fact Table | Purpose | Characteristics |
|---|---|---|
| fact_emails | Central email transactions | 22 columns, PK: Email id |
**Dimension Tables**
| Table | Purpose | Characteristics |
|---|---|---|
| dim_contacts | Personnel master | 6 attributes |
| dim_department | Organizational units | Department lookup |
### Measures (Complete Catalog)
#### TotalEmailsSent
**Business Logic:** Foundational count of all email transactions...
```dax
COUNTROWS(fact_emails)Business Logic: Composite metric synthesizing outbound and inbound activity...
[TotalEmailsSent] + [TotalEmailsReceivedByToName]
[... 46 more measures documented ...]
---
## 🎨 Customization
### Modify Documentation Template
Edit `semiautomatic_prompt.md` to change:
- Document structure & sections
- Formatting & styling rules
- Business logic inference patterns
- Output naming conventions
### Customize Word Output
Update `Professional_Template.docx` to apply:
- Corporate branding/colors
- Custom font schemes
- Header/footer templates
- Page numbering styles
### Adjust PDF Export
Modify `ConvertToPdf.py` to change:
- Table color schemes
- Code block styling
- Paragraph spacing
- Page size/margins
---
## 🔧 Configuration
### `requirements.txt`
python-docx>=0.8.10 pypandoc>=1.10 pywin32>=305 # Windows only
### Key Variables (`ConvertToPdf.py`)
```python
BASE_DIR = os.path.dirname(os.path.abspath(__file__))
WORD_TEMPLATE = os.path.join(BASE_DIR, "Professional_Template.docx")
TABLE_STYLE = 'Table Grid'
MD_PREFIX = "TechnicalDesign_"
Project: EmailCommunication_DummyReport
| Metric | Value |
|---|---|
| Extraction Time | 45 seconds |
| Documentation Generation | 2 minutes 15 seconds |
| Total Pipeline Time | 3 minutes |
| Document Length | 446 lines |
| Measures Documented | 48 |
| Tables Documented | 13 |
| Relationships Mapped | 10 |
vs. Manual Documentation:
- Estimated Manual Time: 12-14 hours
- Accuracy Gain: Manual inherently 10-15% incomplete
- Update Frequency: Manual = never; Automated = on-demand
We'd love your feedback and contributions!
Found a bug or edge case? Open an issue with:
- Error message & traceback
- Input file (or sample)
- Expected vs actual output
Ideas for improvements?
- Auto-generate relationship diagrams (Mermaid/Visio)
- Measure completeness validation
- Multi-language support (ES/EN/FR/DE)
- Git integration (auto-regenerate on commit)
- Version diff reports
- Support for Looker, Tableau, QlikView
Let us know in Discussions what you'd build next.
- Medium Article — Full explanation & case study
- Setup Guide — Step-by-step installation
- API Reference — Python function documentation
- Examples — Sample projects and outputs
MIT License — See LICENSE file for details.
Free to use, modify, and distribute for personal and commercial projects.
- Check FAQ
- Search existing issues
- Start a discussion
- Open an issue with details
- 💼 Business inquiries: [your-email@example.com]
- 🐦 Twitter: [@your-handle]
- 💬 Discord Community: [join link]
Working through this project, you'll understand:
- ✅ Power BI semantic model structure & metadata
- ✅ Markdown → DOCX → PDF conversion pipeline
- ✅ COM automation (Windows-specific)
- ✅ Professional document styling & formatting
- ✅ Batch file processing & automation
- ✅ Python best practices for data extraction
- ✅ Metadata extraction from .pbip projects
- ✅ Professional documentation generation
- ✅ DOCX/PDF export with styling
- ✅ Batch processing support
- 🔄 Auto-generated relationship diagrams
- 🔄 Measure impact analysis
- 🔄 Data lineage tracking
- 🔮 Support for Looker, Tableau, QlikView
- 🔮 Multi-language documentation
- 🔮 Git/DevOps integration
- 🔮 Interactive HTML dashboards
"Here's the complete model documentation" — saves weeks.
Every measure documented, every relationship justified.
When someone leaves, documentation is the single source of truth.
Track what changed between model versions.
Professional docs for business users and executives.
If this project saved you time or helped your team:
⭐ Star the repo — Helps others discover it
🐦 Share on Twitter — Tell your network
💬 Leave feedback — Help us improve
🤝 Contribute — Suggest features, fix bugs
- First time? Start with the Quick Start section
- Lost? Check Project Structure
- Questions? See Support
- Want to contribute? See Contributing
Made with ❤️ for Data Engineers & BI Developers
Last updated: May 2026