Adoption of pydoit as the main task management solution #110
Conversation
Wow! This is amazing! ❤️
Not surprisingly, I believe that! 😄 Feel free to go step by step, drop random comments without analysing it completely, or ask seemingly "stupid" questions. The modifications here are so easy and obvious, but at the same time it is so difficult to see where each modification is coming from. Moreover, handling #109 first, and maybe containers, will significantly simplify this PR.
Force-pushed from a700c21 to 988ab8f
I rebased on top of master, after the recent enhancements. CI is failing in this PR because several steps expect pydoit to be available in the containers. That will be fixed as soon as this PR is merged and the 'Containers' job is updated once. Alternatively, the first two commits can be cherry-picked first.
Thank you so much! However, I would like to isolate "core" and "setups" into separate repositories (I think you have proposed that already some time ago 😉). So this PR would be (mainly) something for
Note that each of the commits in this PR is usable. If you want, we can discuss and merge them one by one, rather than picking everything at once.
The usage of doit is independent of having sources in one or two repositories. In most tasks, I tried not to do substantial modifications, so it's easier for you to see the equivalence. However, if/when this is merged, I would like to do further enhancements such as dissolving
As far as I can see, nothing substantial is deleted in this PR (except for the osflow makefile), so doit is just a unified wrapper for now, right? I am ok to merge all of this PR.
Thanks 😅 👍
I have checked out the help target and that would be very helpful, indeed.
That's ok. As mentioned above, I am fine to merge all at once. I need to get familiar with all the new features anyway 😉 Thanks again for putting so much effort into this! 👍 ❤️
I'm glad to see that you haven't used environment setup tasks. Should sim:VUnit and sim:Simple really be subtasks? I'm not so sure. The idea is that doit sim will run all subtasks. Does that make sense here, since they appear (from what I can tell) to be the same test in two different frameworks?
Overall it looks good. Great job @umarcor!
dodo.py
Outdated
# FIXME: It should be possible to use '--' for separating the args to be passed raw to the action, instead of
# requiring a param and wrapping all the args in a single string
If you wrapped doit into a custom script to pull out trailing args and forward everything to DoItMain, this is easily achievable (and I've seen it done before). All tasks would have to take that "extra_args" param.
Thanks! I guess that extra_args might be handled as a list and not just a string?
Based on your hint and on https://github.com/pydoit/doit/blob/master/doit/api.py, I implemented umarcor@558ca09. That is not perfect, because EXTRA_ARGS is passed as a list to the single-string shell commands. However, that is a different issue. Handling -- does work as you suggested. Thanks!
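For reference, a minimal sketch of the wrapper idea discussed here, assuming a dodo module with the task definitions; the EXTRA_ARGS hook is illustrative, not the exact code from 558ca09:

#!/usr/bin/env python3
# Split argv on '--': everything after it is kept for the task actions,
# everything before it is handed to doit as usual.
import sys

from doit.cmd_base import ModuleTaskLoader
from doit.doit_cmd import DoitMain

import dodo  # the module defining the task_* functions

if "--" in sys.argv:
    idx = sys.argv.index("--")
    dodo.EXTRA_ARGS = sys.argv[idx + 1:]  # hypothetical hook read by the actions
    args = sys.argv[1:idx]
else:
    args = sys.argv[1:]

sys.exit(DoitMain(ModuleTaskLoader(dodo)).run(args))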
    design: str,
    top: str,
    id: str,
    board_srcs: list,
List of what? List[str] maybe.
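For illustration, with explicit typing hints the parameters would read like this (the enclosing function name is not shown in the snippet, so Run is only a placeholder here):

from typing import List


def Run(  # placeholder name; only the parameters are shown in the snippet above
    design: str,
    top: str,
    id: str,
    board_srcs: List[str],
):
    ...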
Yeah... I was too lazy to import the typing module, so I just wrote dumb type hints as a placeholder.
The evolution of this source will depend on the timing of the pyIPCMI/pyCLIAbstraction split. This is the reason I want to help implement GHDLAnalyze, GHDLSynth, YosysSynth and NextpnrImpl in that stack.
It will also depend on #131. If this repo is split, half of these pydoit targets would belong to the setups repo.
class Design:
    def __init__(self, name: str, vhdl: list, verilog: list = None):
        self.Name = name
        self.VHDL = vhdl
        self.Verilog = verilog
This could be a dataclass.
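For comparison, a sketch of the dataclass form, keeping the attribute names of the class above:

from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Design:
    Name: str
    VHDL: List[str]
    Verilog: Optional[List[str]] = None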
Yes. However, I do not want to implement yet another dataclass myself. In this PR, I tried to implement very few and rather obvious classes, so that non-Python users can most easily understand the benefits of using a language that supports object-oriented programming, rather than bash/makefiles.
In practice, Design, Fileset and Project should be imported from the pyEDAProject split from pyIPCMI, from pyCAPI, from edalize, from PyFPGA... that is, from anywhere except reinventing the wheel (again).
Some common models for EDA tool flows would be a good idea. I was investigating something similar for cocotb.
dodo.py
Outdated
yield {
    "basename": "ex",
    "name": None,
    "doc": "Build an example design for all the supported boards",
}
for board in BOARDS:
    yield {
        "basename": "ex",
        "name": board,
        "actions": [CmdAction((Example, [], {"board": board}))],
        "doc": "Build an example design for board {}".format(board),
        "uptodate": [False],
        "pos_arg": "posargs",
        "params": [
            {
                "name": "design",
                "short": "d",
                "long": "design",
                "type": str,
                "default": environ.get("DESIGN", "MinimalBoot"),
                "help": "Name of the design",
            },
        ],
    }
Is there a benefit to using basename instead of moving ex into task_ex?
As commented below, I was experimenting with the flexibility and constraints of pydoit tasks and subtasks.
I wanted the body of basename 'Example' to be the one in basename 'ex'. That is, I wanted to override the default behaviour of "execute all the subtasks". That's why there are three yields in the same task_Example.
In practice, you are correct: it is equivalent to have the 'ex' task and subtasks moved to a different function, and avoid two of the three yields.
whoops, double posted...
That was some testing/learning of how pydoit works (i.e., how it is expected to be used). In the Example command, I wanted the root/base to have a different behaviour than "execute all the subtasks". Conversely, in the
@ktbarrett, do you know if pydoit supports more than one level of subtasks? I could not find any example of it.
@umarcor pydoit does not support more than one level of sub-tasks: pydoit/doit#170 (@leftink is a coworker). To me, the idea of sub-tasks is for things like regressions, where there are multiple tests and you may want to run one, or all of them. Of course, VUnit handles that for you already, so it may not be worthwhile. I'm not sure about the capabilities of VUnit, but if it can collect and return the test list, you can dynamically create sub-tasks based on that. That fits, to me.
Thanks!
Agree. VUnit does have a runner, and implements the functionality internally. In fact, we don't even need to wrap the VUnit entrypoint in pydoit; that's why I wanted to support
However, as @leftink commented, I find there are other use cases where having multiple levels of hierarchy makes sense. For instance, the riscv-arch-tests have multiple test suites and each test suite has multiple tests. Similarly, in GHDL we have several levels in the test suites. Overall, it's interesting to know which grouping patterns are supported by pydoit.
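A sketch of the dynamic sub-task idea from the comments above; collect_vunit_tests is a hypothetical helper (and the test names are stand-ins), not an existing VUnit API:

def collect_vunit_tests():
    # Hypothetical stand-in; a real implementation would query the VUnit run script.
    return ["test_a", "test_b"]


def task_sim():
    """Generate one sub-task per collected test case."""
    for test in collect_vunit_tests():
        yield {
            "name": test,
            "actions": ["python sim/run.py '{}'".format(test)],
            "uptodate": [False],
        }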
I am really looking forward to merging this. 👍
Force-pushed from 51d2310 to 076920e
So, this PR is now all green. However, instead of merging it all at once, I'm thinking it might be desirable to handle it one commit/change at a time. @stnolting, if you agree, I can start with that procedure.
Great to hear! I am really looking forward to this.
That would be nice 😉
This is a rather disruptive PR for showcasing the potential benefits of using a Python based orchestration/management solution at the project level, instead of only using makefiles and/or shell scripts.
Context
This topic was discussed both in the context of NEORV32 and in the wider context of any HDL project including software sources and using both open source and vendor EDA tooling.
In NEORV32, we are currently using the following entrypoints, at least:
.github/generate-job-matrix.py
docs/Makefile
sw/example/processor_check/check.sh
sw/example/processor_check/Makefile
sim/ghdl_sim.sh
sim/run.py
riscv-arch-test/run_riscv_arch_test.sh
setups/examples/Makefile
setups/osflow/common.mk
setups/quartus/**/*.tcl
setups/radiant/**/*.rdf
setups/vivado/**/*.tcl
I read my own notes from Open Source Verification Bundle: API | Tool again, and I remembered @Fatsie and @ktbarrett dropping some nice words in gitter.im/hdl/community about pydoit. So, I read about the main features:
That might be exactly what we need to put all the entrypoints above under some umbrella. I read the pydoit: Success Stories too:
The motivations for developing and using pydoit are very similar to why I wrote dbhi/run.
Moreover, while reading the docs I found more similarities between the implementation of pydoit and how I had conceived a possible solution.
Therefore, I decided to give pydoit a try using NEORV32 as a proof of concept.
Requirements
Python
doit
is a Python tool. Therefore, using it for managing project level tasks implies adding Python as an explicit requirement for "using" NEORV32. For now, it is strictly possible to still use most of the makefiles or shell scripts, but the proposal is to slowly replace "all" the plumbing with pythonic alternatives. More on what "pythonic alternative" means below.
Having Python as a requirement is not a big problem in CI, because GitHub Actions environments do have it pre-installed, and doit is a Python-only package (it does not rely on any shared library or compiled artifact). It is also rather common to have Python 3 installed on the system nowadays, and it is available through most package managers (apt, dnf, yum, pacman, etc.). Therefore, I would consider this an acceptable requirement, especially given that NEORV32 does already use VUnit for advanced simulation/testing.
Anyway, we should maybe add a requirements.txt file and some comment to the README in order to make it clear to the users.
ghcr.io
We are currently executing implementation and VUnit simulation tasks inside containers which do have Python but which don't have doit preinstalled. In this PR a single dockerfile was added, which installs doit on top of the container from hdl/containers used for implementation. Currently, that image is built right before using it, in the same workflow. However, I would suggest using ghcr.io/stnolting/neorv32 for that purpose.
The GitHub Container Registry is accessible with the default github.token available in CI; therefore, there is no additional setup needed for using it. Moreover, images can be pulled without authentication, so both users and developers can use them locally or in their own CI. This is desirable, because it provides consistency and it makes it easier to find bugs/mismatches by decoupling them from the environment.
Nevertheless, note that we have redundancy: the scripts/entrypoints are tested locally too (e.g. on MSYS2). That helps us ensure that the tasks do not depend on the containers, but can optionally use them.
@stnolting, if you are ok with this approach, I would create a separate PR to contribute a CI workflow where the "development containers" for NEORV32 are built and pushed to ghcr.io/stnolting/neorv32. Having that merged separately will keep this PR slightly cleaner.
Wrapping, CLI and pythonism
pydoit is a very interesting project for anyone evaluating alternatives for a seamless migration from make/shell only to some Python based solution. I recommend that everyone have a look at the documentation: pydoit.org/contents. It has lots of examples and every feature is introduced very directly.
The three core concepts/features of pydoit can be summarised using the titles of the sections in the Guide:
Tasks
The main concept in pydoit is that tasks can be either a Python function or a shell command: https://pydoit.org/tasks.html#actions. Python functions can be just regular or can be explicitly written for dealing with the pydoit API. Similarly, shell commands can be executed raw, or can be partially auto-generated using pydoit's features. That is, pydoit wraps calling commands through Popen and handles the repetitive code for different platforms. It allows optionally overriding the options used for calling Popen.
Therefore, the most important idea to understand is that the main use case for pydoit in the context of HDL projects is to be a wrapper around existing entrypoints. Those can be make, cmake, shell scripts, other Python scripts, nodejs, vendor tools... whatever.
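To make that concrete, here is a minimal sketch of what such wrapper tasks can look like (the make target and the names are illustrative, not the actual ones used in this PR):

def task_Documentation():
    """Wrap an existing Makefile as a shell-command action."""
    return {
        "actions": ["make -C docs html"],  # illustrative make target
        "verbosity": 2,
    }


def greet(name):
    print("Hello, {}!".format(name))


def task_Greet():
    """A task whose action is a plain Python callable."""
    return {
        "actions": [(greet, ["NEORV32"])],  # (callable, args) tuple
        "verbosity": 2,
    }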
With regard to implementation details, I'm not completely convinced by the NO API design principle. I see the benefits of a dict/string based solution, particularly when composing/manipulating objects by merging/updating them. In that regard, dicts reduce verbosity compared to classes. However, classes and type hints do provide a more robust solution when things go wrong. For instance, it seems possible to define a command as a string and multiple commands as a list of strings. In certain cases, it is possible to provide one command as a list of strings; however, in other cases that fails.
This is not relevant in practice, especially in comparison to make/bash, but it's worth noting for more experienced Python developers.
Dependencies
pydoit does optionally support specifying dependencies and targets of each task/action: https://pydoit.org/tasks.html#dependencies-targets. By using files as both dependencies and targets, pydoit can be used as a direct replacement for makefiles. However, that is pointless per se (you would just use the makefile in the first place).
The reason that is useful is the second relevant concept, explained in https://pydoit.org/tasks.html#file-dep-file-dependency:
That is precisely the main feature that HDL projects are missing in make-like solutions, which require defining dummy targets as a workaround. E.g., we want to say "first do synthesis, then PnR, then generate a bitstream". We do not want to care about the specific filenames involved in each stage. That's something we want to have defined somewhere, and we want to have meaningful errors if something is wrong, but we do not want to call make my_very_long_and_complex_bitstream_name.bit.
Tasks in pydoit can save computed values (https://pydoit.org/dependencies.html#saving-computed-values) or can get them from previous tasks (https://pydoit.org/dependencies.html#getargs), which are complementary to having dependency files (or tasks) or artifact files (targets).
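As a sketch of that idea (tool options and file names below are placeholders, not the actual osflow flow), the user only ever types doit Bitstream and doit resolves the chain from file_dep/targets:

JSON = "build/neorv32.json"
ASC = "build/neorv32.asc"
BIT = "build/neorv32.bit"


def task_Synthesis():
    return {
        "actions": ["yosys -p 'synth_ice40 -top top -json {}' rtl/top.v".format(JSON)],
        "file_dep": ["rtl/top.v"],
        "targets": [JSON],
    }


def task_PnR():
    return {
        "actions": ["nextpnr-ice40 --up5k --json {} --asc {}".format(JSON, ASC)],
        "file_dep": [JSON],
        "targets": [ASC],
    }


def task_Bitstream():
    return {
        "actions": ["icepack {} {}".format(ASC, BIT)],
        "file_dep": [ASC],
        "targets": [BIT],
    }

Since the bitstream's file_dep is the target of PnR, and PnR's file_dep is the target of Synthesis, doit runs the three tasks in order, and only when their outputs are out of date.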
I did not go deeper into this feature yet, because I wanted this PR to be mostly about wrapping existing entrypoints. However, I think that "redistributing reusable pydoit tasks" is something we might propose to the community in gitter.im/hdl/community. That is:
A GHDLAnalyse task, which we would call for https://github.com/stnolting/neorv32/blob/master/setups/osflow/synthesis.mk#L1-L8. Maybe also GHDLElab, GHDLSim, GHDLSynth, etc.
A YosysSynth task for https://github.com/stnolting/neorv32/blob/master/setups/osflow/synthesis.mk#L10-L16.
NextpnrImpl for https://github.com/stnolting/neorv32/blob/master/setups/osflow/PnR_Bit.mk#L1-L6.
Project*Pack for https://github.com/stnolting/neorv32/blob/master/setups/osflow/PnR_Bit.mk#L8-L14.
Then, allow users to define their own pipelines by composing those tasks, and have doit generate a DAG. Note that I did not yet evaluate the flexibility of pydoit for overriding and dynamically generating the dependencies and, thus, the DAG. However, according to the Success Stories, it's achievable. I did not seriously consider whether those reusable tasks would be better maintained by each project or all kept in the same place. That place would obviously not be NEORV32, but maybe edalize, PyFPGA or some other project can do that (probably wrapping their own interfaces).
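For instance, a redistributable GHDLAnalyse could be little more than a function returning a (sub)task dict that projects yield from their own task generators; the interface below is only a guess for illustration, not an agreed API:

from pathlib import Path


def GHDLAnalyse(units, std="08", workdir=Path("build")):
    """Return a pydoit sub-task dict that analyses a list of VHDL sources."""
    files = [str(unit) for unit in units]
    return {
        "name": "GHDLAnalyse",
        "actions": [
            "ghdl -a --std={} --workdir={} {}".format(std, workdir, " ".join(files))
        ],
        "file_dep": files,
    }

A project's dodo.py could then yield GHDLAnalyse(units=[...]) from a task generator and compose it with YosysSynth/NextpnrImpl equivalents.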
CLI
pydoit seems to be conceived so that all tasks are visible and callable through a single CLI entrypoint. That is coherent with using it as the main orchestrator/wrapper around other build/run entrypoints. As a result, there is built-in functionality for specifying the CLI arguments that each task might accept: https://pydoit.org/task_args.html.
It supports short and long argument names, specifying the type, having a choice (enumeration), a help string, a default value, capturing positional arguments, etc. Overall, it's quite complete: it resembles argparse, and the information shown by doit help is helpful.
However, my perception of this area is bittersweet. I am very conditioned because I'm a user of spf13/cobra along with spf13/viper. Those provide a very nice solution for handling CLI, configuration files and environment variable names, which are automatically handled as a resolved set of parameters for the developer. However, those are written in golang. Other than that, I am used to pyAttributes, a project by @Paebbels for building CLIs through decorators, which is built on top of argparse. Therefore, I would have liked it if tasks in pydoit were not public by default and decorators were used for specifying the CLI parameters. Nevertheless, I believe this is mostly a syntactic issue. The three features I am concerned about are composability of subcommands, help output and forwarding arguments.
Composability of subcommands
pydoit has a very nice solution for defining multiple tasks and/or subtasks in a single function: https://pydoit.org/tasks.html#sub-tasks. That uses the same syntax as a regular simple task, but it's defined with yield and it needs a name (and basename) field.
In this PR, the sub-task feature is used for defining the ex and sim tasks. So, instead of doit Example -b Fomu ..., one can execute doit ex:Fomu .... While implementing it, I found two issues: ex will execute all of ex:Fomu, ex:OrangeCrab, ex:UPduino and ex:iCESugar. I did not find how to override ex with Example, so that executing all of them by mistake is not possible. I think there might be other use cases where overriding the "base" is desirable.
Help output
Related to the previous point, the output of doit list and doit help is limited. It is useful enough for knowing which tasks exist, but it might be significantly improved compared to cobra or pyAttributes. For instance, when printing the help of ex, it would be desirable to know that it is related to all the ex:* sub-tasks.
Forwarding arguments
As a VUnit user and co-maintainer, I use VUnit scripts as an entrypoint for all my simulations and co-simulation testbenches. Therefore, a major concern I have with edalize and similar solutions is that they have some integration, but they don't honor the user's scripts or CLI commands. As a result, I need to explicitly adapt my standalone VUnit scripts in order to reuse them with those tools for synthesis, implementation, etc.
When using pydoit, it is possible to wrap existing entrypoints almost transparently: https://pydoit.org/task_args.html#arguments. That is used in this PR as doit sim:VUnit -a '--ci-mode -v', instead of path/to/vunit/run.py --ci-mode -v. That works, but it is not ideal. Instead, it would be desirable to support -- for separating the args for doit from the args for the task. That's common in many tools and it avoids the issues derived from having to quote arguments inside the string.
Implementation
The main file where the doit tasks are defined is dodo.py, in the root of the repo. I suggest opening it while reading the following list:
.github/generate-job-matrix.py: removed and implemented as a Python function, wrapped in task GenerateExamplesJobMatrix.
docs/Makefile: wrapped as-is in task Documentation, used in the Documentation workflow.
sw/example/processor_check/check.sh: replaced with task BuildAndInstallSoftwareFrameworkTests, which calls the same 4 make commands from a list of actions.
sw/example/processor_check/Makefile: task BuildAndInstallCheckSoftware is added for calling this makefile, instead of calling the command written in the workflow YAML files.
sim/ghdl_sim.sh: wrapped as-is in sub-task sim:Simple.
sim/run.py: wrapped in sub-task sim:VUnit, forwarding args through -a.
riscv-arch-test/run_riscv_arch_test.sh: wrapped as-is in task RunRISCVArchitectureTests.
setups/examples/Makefile: removed and replaced by task Example and sub-tasks ex:*. More on this below.
setups/osflow/common.mk: untouched, wrapped in function Run (not a task).
setups/quartus/**/*.tcl: untouched.
setups/radiant/**/*.rdf: untouched.
setups/vivado/**/*.tcl: untouched.
This is how the CLI looks:
Hence, except for setups/examples/Makefile, almost all other existing entrypoints were wrapped as-is. As commented above, the purpose of this PR is not to reimplement all of those entrypoints, but to showcase the flexibility of pydoit for preserving some, slightly improving others and completely rewriting a few.
As discussed in #96, my main motivation for trying this with NEORV32 now was the complexity of setups/examples/Makefile and the difficulty for potential contributors to use it or try adding support for new boards. Therefore, I spent some time addressing this in a more pythonic way. I created subdir tasks:
project.py: three Python classes (Project, Filesets and Design), which contain the HDL sources for each design and board. This is a proof of concept. Looking into the future, I would expect this file to be blended with pyCAPI, pyIPCMI, FuseSoc, PyFPGA or any other tool which does already provide "fileset management and orchestration" features. For now, it allows users who want to add some design or support some new board in NEORV32 to mostly focus on this file only.
examples.py: three helper functions, which behave as a middleware between the Project class and the pydoit features/API.
Run: a wrapper around setups/osflow/common.mk. This illustrates that users might still decide to call the makefiles from osflow, ignoring the Python helpers. It also explains why the now removed setups/examples/Makefile had several recursive calls for setting all the parameters.
Example: while Run is expected to be generic for any design using NEORV32 and osflow, function Example is a wrapper around it, which generates most of the arguments from the data in the Project class, given the names of the target board and example design.
GenerateExamplesJobMatrix: same as the now removed .github/generate-job-matrix.py. The content it returns is hardcoded for now. However, it can be enhanced for generating it dynamically from the PRJ object.
The pydoit tasks and sub-tasks corresponding to those sources are defined in task_Example of dodo.py. There are several yield commands, all of them wrapping function Example from tasks/examples.py. The doit Example expects arguments --board and --design, while doit ex:BOARDNAME expects --design only. In both cases, positional arguments are the actual osflow make targets (e.g. doit ex:OrangeCrab -d MinimalBoot clean bit svf).
Therefore, compared to the other entrypoints wrapped as-is, task_Example and tasks/*.py illustrate how to remove one makefile (setups/examples/Makefile) and go all-in with using Python while still using makefiles under the hood.
Pythonism
Although I'm not the most proficient Python coder, I tried applying some good practices such as:
Path for dealing with locations and filenames (doit does support strings and/or Paths indeed).
black for formatting (with line-width set to 120).
Together with the string/struct/dictionary manipulation inherent to Python, I hope this makes the point of the benefits that Python based plumbing can provide.
Conclusion
Overall, my perception is that the Python requirement itself is the only relevant concern for adopting pydoit in the context of NEORV32 or other HDL projects involving software and open source EDA tooling. Other than that, not all the features are ideal, but it does provide a seamless solution for wrapping all the existing entrypoints non-intrusively and without giving away any capability. In fact, just wrapping entrypoints as-is does already provide benefits such as doit list, doit help, allowing makefiles/scripts to be moved without users noticing, type checking, error handling...
pydoit feels especially well suited for potentially using FuseSoC/edalize, PyFPGA, tsfpga, Xeda... in this project with minimal burden. While adopting any of those tools as the main solution would constrain the supported backends and the types of tasks they can handle, pydoit allows either very lightweight or tightly coupled integrations.
With regard to remote execution and/or orchestration/distribution of multiple workers in a pool, that seems not to be in the scope of pydoit. It is better supported by Apache Airflow or Google's Cloud Composer. However, that's not in the scope of NEORV32 either. Therefore, although evaluating those might be interesting for the HDL community, we will leave it for another day 😄.
By the same token, NetworkX might provide better DAG analysis, sorting and reduction than pydoit. That might allow a more fine-grained control of the tasks to be executed, similarly to the >FNODE, FNODE> or >FNODE> syntax in dbhi/run (implemented using gonum/graph). However, since pydoit can generate graphviz diagrams, it should be possible to import those into NetworkX for analysis. Yet, this is again interesting for the HDL community, but out of scope in NEORV32 for now and for today.
Future work
If this proposal is accepted and pydoit is used in NEORV32, these are some of the possible enhancements I would like to try (not necessarily in order, and not necessarily in the short term):
setups/osflow/fileset.mk: replace it with hardcoding the same data in tasks/project.py or, maybe, using a *.core, along with reusing it in sim/run.py.
setups/osflow/synthesis.mk, setups/osflow/PnR_Bit.mk and setups/osflow/tools.mk.
setups/osflow/boards/* and setups/osflow/constraints/*: replace with submoduling hdl/constraints (after adding some minimal python plumbing there).
common.mk.
sim/ghdl_sim.sh: replace it with a sequence of pydoit actions (or a sequence of tasks if GHDLImport, GHDLMake and GHDLSim are implemented).
docs/Makefile: replace with a pydoit task.