Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files

    @@            Coverage Diff             @@
    ##             main     #588      +/-   ##
    ==========================================
    - Coverage   89.47%   84.79%   -4.69%
    ==========================================
      Files          37       37
      Lines        3544     3301     -243
      Branches     3544     3301     -243
    ==========================================
    - Hits         3171     2799     -372
    - Misses        179      313     +134
    + Partials      194      189       -5

Pull Request Overview
This PR documents the input file format by generating markdown tables from YAML table schemas. Key changes include the addition of new YAML schema files for various input files, improvements to the documentation generation script, and updates to the GitHub action to integrate the new documentation process.
Reviewed Changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| schemas/input/regions.yaml | Added basic region schema for defining regions. |
| schemas/input/processes.yaml | Added main processes schema with field definitions. |
| schemas/input/process_parameters.yaml | Added process parameters schema with details on process inputs. |
| schemas/input/process_flows.yaml | Added schema for commodity flows of each process. |
| schemas/input/process_availabilities.yaml | Added schema for process availabilities. |
| schemas/input/demand_slicing.yaml | Added schema for annual demand slicing details. |
| schemas/input/demand.yaml | Added schema for service demand commodity entries. |
| schemas/input/commodity_costs.yaml | Added schema for commodity cost definitions. |
| schemas/input/commodities.yaml | Added schema for commodities. |
| schemas/input/assets.yaml | Added schema for asset definitions. |
| schemas/input/agents.yaml | Added schema for agent definitions. |
| schemas/input/agent_search_space.yaml | Added schema for agent search space definitions. |
| schemas/input/agent_objectives.yaml | Added schema for agent objectives with decision rule details. |
| schemas/input/agent_cost_limits.yaml | Added schema for agent cost limits. |
| schemas/input/agent_commodity_portions.yaml | Added schema for commodity demand portions per agent. |
| docs/input_format.md | New documentation page for the input format. |
| docs/generate_input_format_doc.py | Script to auto-generate input format documentation from schemas. |
| docs/SUMMARY.md | Updated to include the Input Format documentation link. |
| doc-requirements.txt | Added table2md dependency requirement. |
| .github/actions/generate-docs/action.yml | Updated GitHub action to install deps and generate input docs. |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
I managed to create Maybe a Windows thing? Also, we should either add this file to gitignore or commit it - what do you think?
    if __name__ == "__main__":
        print(generate_markdown(), end="")
Suggested change:
    -     print(generate_markdown(), end="")
    +     output_path = _DOCS_DIR / "input_format.md"
    +     output_path.write_text(generate_markdown(), encoding="utf-8")
Any reason not to do this? (fixes the utf-8 problem for me on Windows, and simpler just to run python generate_input_format_doc.py)
Not really. I think that's a cleaner way of doing things.
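The suggested change can be sketched in isolation as follows. This is a minimal sketch, not the script from this PR: `generate_markdown()` here is a stand-in that returns a short string containing a non-ASCII character, `write_doc` is a hypothetical helper, and a temporary directory takes the place of `_DOCS_DIR`. The point is only that passing `encoding="utf-8"` explicitly avoids depending on the platform's default text encoding.

```python
import tempfile
from pathlib import Path


def generate_markdown() -> str:
    """Stand-in for the real generator; returns text with a non-ASCII character."""
    return "# Input format\n\nAll fields documented ✅\n"


def write_doc(docs_dir: Path) -> Path:
    """Write the generated markdown to docs_dir/input_format.md as UTF-8."""
    output_path = docs_dir / "input_format.md"
    # Explicit encoding: do not rely on the platform's default text encoding
    output_path.write_text(generate_markdown(), encoding="utf-8")
    return output_path


if __name__ == "__main__":
    # A temporary directory is used here in place of the real docs directory
    with tempfile.TemporaryDirectory() as tmp:
        path = write_doc(Path(tmp))
        print(path.read_text(encoding="utf-8"), end="")
```

Writing straight to the file also means the script can be run on its own, without redirecting its output in the shell.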
    with path.open() as f:
        data = yaml.safe_load(f)

    info = data["title"]
The "{title}. {description}" format read really weirdly for most files. e.g.
"Commodity demand portions for agents. Portions of commodity demand for which agents are responsible."
I would say we don't need a title field for the files, just go straight to the description
Actually, I would split this into "Description" and "Notes", as also suggested for the individual fields, and put the notes underneath the table
tsmbland left a comment
I think we can refine this as we go along, but as a starting point this is awesome. Good job!
    def fields2table(fields: list[dict[str, str]]) -> str:
        data = [
            {
                "Field": f"`{f['name']}`",
I would change the column titles to "Field", "Description" and "Notes"
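A rough sketch of what a three-column version of `fields2table` might look like. The `description` and `notes` keys are assumptions about how the schema fields could be split, and the table is built with plain string formatting rather than the `table2md` dependency this PR adds, so treat it as an illustration of the column layout only.

```python
def fields2table(fields: list[dict[str, str]]) -> str:
    """Render field definitions as a markdown table (Field, Description, Notes)."""
    header = ["Field", "Description", "Notes"]
    rows = [
        # "description" and "notes" are hypothetical keys; missing ones become empty cells
        [f"`{f['name']}`", f.get("description", ""), f.get("notes", "")]
        for f in fields
    ]
    lines = [
        "| " + " | ".join(header) + " |",
        "|" + "|".join("---" for _ in header) + "|",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    print(fields2table([{"name": "id", "description": "Unique ID", "notes": "Must be unique"}]))
```

Note that this sketch does not escape `|` characters inside cell text, which a real implementation would need to handle.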
    description: |
      Defines processes for the system.

      Every SED (supply equals demand) commodity must have both producer and consumer processes for
Comments like this are hard to know where to place, as they involve multiple tables. I would say this probably makes more sense in the commodities file, but I can see why you would include it here. Alternatively, we could have a separate file documenting global validation checks. What do you think?
That's a good point. I put it here kind of mindlessly because the relevant check is in with process-related code. I think we should probably just move it to commodities.yaml.
We might want to have somewhere to mention global checks. I'd like to turn the document into a jinja template at some point, which will make it a bit easier to add a preamble with info about global checks.
Ah, yes. It's because the default file encoding for Python on Windows isn't UTF-8 for historical reasons (which is a constant source of annoyance). Maybe we should just write directly to the file, as you suggest.
I'm thinking the gitignore route would be cleaner, otherwise the committed file will be perpetually out of date. I stuck a placeholder file in |
Description
I started working on #530 and realised it probably made more sense to just document the input format rather than attempting to describe each of the (many) small checks we're doing in prose form, so that's what I've done here.
Rather than manually writing a bunch of markdown tables, I thought it would be easier to generate them from some source files. I've used the table schema format to document the CSV files, which is published by the people who make the frictionless Python framework. The nice thing about using schemas is that we could also use them to validate the data, which would tell us whether we've forgotten to document any fields or if the types have changed etc. (NB: This would be purely for documentation -- the Rust code already validates the input data perfectly well.)
There wasn't a tool to produce documentation from table schemas already, so I knocked together a script. It's a bit rough around the edges -- in retrospect, I wish I'd used jinja with a template instead -- but I figure it's probably fine for now. We can reuse it with some tweaks when we come to documenting the output format (#529).
I haven't documented model.toml yet, but we could adopt a similar approach there. frictionless doesn't support TOML directly, but we could just use a JSON schema, which is more or less the same thing.
Closes #530.
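The idea of using the schemas to spot undocumented fields could look roughly like the sketch below. It uses only the standard library rather than frictionless, and assumes the schema has already been loaded into a dict with a top-level `fields` list of `name` entries, as in the table schema format. The function name and the example columns are hypothetical.

```python
import csv
import io


def undocumented_columns(schema: dict, csv_text: str) -> tuple[set[str], set[str]]:
    """Return (CSV columns missing from the schema, schema fields missing from the CSV)."""
    documented = {field["name"] for field in schema["fields"]}
    # Only the header row is needed to compare column names
    header = next(csv.reader(io.StringIO(csv_text)), [])
    actual = set(header)
    return actual - documented, documented - actual


if __name__ == "__main__":
    # Hypothetical schema and CSV content, for illustration only
    schema = {
        "fields": [
            {"name": "id", "type": "string"},
            {"name": "description", "type": "string"},
        ]
    }
    extra, missing = undocumented_columns(schema, "id,colour\nGBR,red\n")
    print("Not in schema:", extra)
    print("Not in CSV:", missing)
```

This only compares column names; checking types would need the schema's `type` entries and is left out here.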
Type of change
Key checklist
$ cargo test
$ cargo doc
Further checks