-
Notifications
You must be signed in to change notification settings - Fork 54
LCORE-581: refactored gendoc script #449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LCORE-581: refactored gendoc script #449
Conversation
WalkthroughRefactors scripts/gen_doc.py by extracting per-directory README generation into generate_docfile(directory) and generate_documentation_on_path(path); main() now iterates Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant U as CLI/User
participant M as main()
participant FS as Path("src").rglob("*")
participant G as generate_documentation_on_path(path)
participant D as generate_docfile(directory)
participant IO as Filesystem
U->>M: invoke script
M->>FS: enumerate src/ and subdirectories
loop per directory
M->>G: call generate_documentation_on_path(dir)
rect rgba(220,235,255,0.4)
note right of G: save current CWD
G->>IO: chdir(dir)
G->>D: call generate_docfile(dir)
D->>IO: scan *.py and parse AST for first docstring line
D->>IO: write README.md in dir
G-->>IO: finally: restore previous CWD
end
end
M-->>U: exit
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
scripts/gen_doc.py (4)
11-34: Prefer zero side-effects: avoid chdir and operate on Path objects end-to-end.Repeatedly changing CWD is fragile (harder to reuse from other scripts, risks surprising interactions, complicates error handling). Since you already pass the target path around, you can generate README.md under that directory without chdir. This also nicely decouples generate_docfile from global state.
Apply this diff to eliminate chdir and make both functions path-driven:
def generate_docfile(directory): - """Generate README.md in the CWD.""" - with open("README.md", "w", encoding="utf-8", newline="\n") as indexfile: - print( - f"# List of source files stored in `{directory}` directory", - file=indexfile, - ) - print("", file=indexfile) - files = sorted(os.listdir()) - - for file in files: - if file.endswith(".py"): - print(f"## [{file}]({file})", file=indexfile) - with open(file, "r", encoding="utf-8") as fin: - source = fin.read() - try: - mod = ast.parse(source) - doc = ast.get_docstring(mod) - except SyntaxError: - doc = None - if doc: - print(doc.splitlines()[0], file=indexfile) - print(file=indexfile) + """Generate README.md under `directory` (no global CWD changes).** + """ + # Local import to avoid touching top-level imports; also safe for reuse. + import tokenize + dir_path = Path(directory) + readme_path = dir_path / "README.md" + with open(readme_path, "w", encoding="utf-8", newline="\n") as indexfile: + print( + f"# List of source files stored in `{dir_path.as_posix()}` directory", + file=indexfile, + ) + print("", file=indexfile) + for py_path in sorted(dir_path.glob("*.py")): + print(f"## [{py_path.name}]({py_path.name})", file=indexfile) + try: + with tokenize.open(py_path) as fin: + source = fin.read() + mod = ast.parse(source) + doc = ast.get_docstring(mod) + except (SyntaxError, UnicodeDecodeError): + doc = None + if doc: + print(doc.splitlines()[0], file=indexfile) + print(file=indexfile) def generate_documentation_on_path(path): - """Generate documentation for all the sources found in path.""" - directory = path - cwd = os.getcwd() - os.chdir(directory) - print(directory) - - try: - generate_docfile(directory) - finally: - os.chdir(cwd) + """Generate documentation for all the sources found in `path`.""" + print(path) + generate_docfile(path)Also applies to: 36-47
21-23: Minor: rename loop variable fromfiletofilenamefor clarity.Small readability tweak; avoids shadowing a historically common name and clarifies intent.
Apply this tiny diff:
- for file in files: - if file.endswith(".py"): - print(f"## [{file}]({file})", file=indexfile) + for filename in files: + if filename.endswith(".py"): + print(f"## [{filename}]({filename})", file=indexfile)
41-41: Nit: make progress output more informative (or toggle via a verbosity flag).Plain
print(directory)is terse. Consider a prefix or using logging for better UX.Example minimal change:
- print(directory) + print(f"[gendoc] Generating README.md in: {directory}")
49-55: Make execution robust to current working directory by anchoring src relative to this script.If someone runs the script from outside the repo root, Path("src") may not resolve. Deriving src relative to this file avoids that.
Apply:
def main(): - """Entry point to this script, regenerates documentation in all directories.""" - generate_documentation_on_path("src/") - for path in Path("src").rglob("*"): + """Entry point to this script, regenerates documentation in all directories.""" + repo_root = Path(__file__).resolve().parents[1] # repo root (parent of scripts/) + src_dir = repo_root / "src" + generate_documentation_on_path(src_dir) + for path in src_dir.rglob("*"): if path.is_dir(): generate_documentation_on_path(path)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
scripts/gen_doc.py(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (2)
scripts/gen_doc.py (2)
11-17: LGTM: Clear modularization and safe CWD restoration.Nice refactor extracting per-directory logic into functions and guarding CWD changes with a finally block. This improves readability and testability of the script.
Also applies to: 36-47, 49-55
24-30: Manual verification required: confirm encoding compliance for Python sourcesI ran the suggested script to detect any Python files declaring a non-UTF-8/UTF8 coding cookie, but it produced no output. Absence of matches may simply mean there are no such declarations, but it’s not a definitive guarantee.
Please manually verify that:
- No Python files in the repository declare a different source encoding via a
# coding:or# -*- coding: -*-header.- All files you intend to parse are indeed UTF-8 (the default in Python 3) or otherwise safe to read with
encoding="utf-8".If you do have any non-UTF-8 files, consider updating the snippet to use
tokenize.open, which respects PEP 263:- with open(file, "r", encoding="utf-8") as fin: - source = fin.read() + # Respect source coding cookie (PEP 263) + import tokenize + with tokenize.open(file) as fin: + source = fin.read()
247f1e2 to
f85cdfd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (7)
src/README.md (7)
1-2: Add an auto-generated notice at the top.Since this README is produced by scripts/gen_doc.py, add a header to prevent accidental manual edits and reduce churn in PRs.
Apply this diff:
+<!-- This file is auto-generated by scripts/gen_doc.py. Do not edit manually. --> + # List of source files stored in `src/` directory
4-4: Fix “Lightspeed-stack” hyphenation.Use “Lightspeed stack” consistently.
-Main classes for the Lightspeed-stack. +Main classes for the Lightspeed stack.-Lightspeed stack. +Lightspeed stack.Also applies to: 16-16
7-7: Correct “LLama” capitalization and tighten phrasing.“Llama” is the common styling; “retrieval” is awkward here.
-LLama stack client retrieval. +Llama stack client.
19-19: Prefer “Logging utilities.”Minor wording improvement.
-Log utilities. +Logging utilities.
21-22: Clarify what reads the version.“Project manager tools” is ambiguous. Suggest phrasing that matches typical build/packaging usage.
-Service version that is read by project manager tools. +Service version string exposed for build and packaging tools.
1-24: Option: note generation timestamp and source path.If the generator supports it, embedding the generation date and source path aids traceability in downstream docs.
Example to have gen_doc.py prepend:
<!-- Generated by scripts/gen_doc.py on 2025-08-26 from src/*.py -->
24-24: Ensure a single trailing newline.Minor formatting nit: keep exactly one newline at EOF to satisfy common linters.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
scripts/gen_doc.py(1 hunks)src/README.md(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- scripts/gen_doc.py
🧰 Additional context used
🪛 LanguageTool
src/README.md
[grammar] ~3-~3: There might be a mistake here.
Context: ...red in src/ directory ## init.py Main classes for the Lightspeed-stack. ...
(QB_NEW_EN)
[grammar] ~6-~6: There might be a mistake here.
Context: ...for the Lightspeed-stack. ## client.py LLama stack client retrieval. ## [confi...
(QB_NEW_EN)
[grammar] ~9-~9: There might be a mistake here.
Context: ... client retrieval. ## configuration.py Configuration loader. ## [constants.py]...
(QB_NEW_EN)
[grammar] ~12-~12: There might be a mistake here.
Context: ... Configuration loader. ## constants.py Constants used in business logic. ## [l...
(QB_NEW_EN)
[grammar] ~15-~15: There might be a mistake here.
Context: ...business logic. ## lightspeed_stack.py Lightspeed stack. ## log.py L...
(QB_NEW_EN)
[grammar] ~18-~18: There might be a mistake here.
Context: ..._stack.py) Lightspeed stack. ## log.py Log utilities. ## [version.py](version....
(QB_NEW_EN)
[grammar] ~21-~21: There might be a mistake here.
Context: ...](log.py) Log utilities. ## version.py Service version that is read by project ...
(QB_NEW_EN)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-pr
- GitHub Check: e2e_tests
🔇 Additional comments (1)
src/README.md (1)
3-23: All README links verifiedRan the provided script against src/README.md; all relative links resolve to existing files. No broken links detected—ready to merge.
Description
LCORE-581: refactored gendoc script
Type of change
Related Tickets & Documents
Summary by CodeRabbit