Improve script and test output. Build using pyproject.toml with setuptools_scm #92

Merged
MichaelClerx merged 24 commits into main from js_new_directory_builder on Nov 5, 2025

@joeyshuttleworth
Collaborator

@joeyshuttleworth joeyshuttleworth commented Oct 31, 2025

Description

Move build system over to a pyproject.toml file. https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#writing-pyproject-toml

Adds a new submodule, directory_builder, for generating output directories. It also creates a "pcpostprocess_info.txt" file which lists the date/time of the run, the pcpostprocess version info, and the relevant git commit. This is now used in all scripts and tests (addresses issue #63). The version information is obtained through setuptools-scm using the new pyproject.toml.
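As a rough illustration of the idea, a helper of this kind might look like the sketch below. The function name `setup_output_directory` appears in the diff excerpts further down this thread, but its real arguments are not shown there, so the parameters `subdir_name` and `version` and the exact layout of the info file are assumptions rather than the merged implementation.

```python
import datetime
import os
import sys


def setup_output_directory(dirname, subdir_name, version="unknown"):
    """Create dirname/subdir_name and write a small provenance file into it.

    Hypothetical signature: the arguments of the real function are not
    shown in this thread.
    """
    output_dir = os.path.join(dirname, subdir_name)
    os.makedirs(output_dir, exist_ok=True)

    # Record when, with which version, and with which command the output was made
    info_path = os.path.join(output_dir, "pcpostprocess_info.txt")
    with open(info_path, "w") as f:
        f.write(f"Date/time: {datetime.datetime.now().isoformat()}\n")
        f.write(f"pcpostprocess version: {version}\n")
        f.write(f"Command: {' '.join(sys.argv)}\n")
    return output_dir
```

Giving each script its own `subdir_name` means two scripts pointed at the same `dirname` write to separate folders.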

Also tidy up some of the output from run_herg_qc:

  • Run qc3bookend only for selected wells
  • Ensure qc6 debug plots are placed in the right folder
  • Ensure all QC debug plots are placed in the right folder (one for each well/protocol)
  • Ensure leak correction plots are kept in folders, one inside the QC folder, and one at the top level

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Testing

  • Testing is done automatically and codecov shows test coverage
  • This cannot be tested automatically

Documentation checklist

  • I have updated all documentation in the code where necessary.
  • I have checked spelling in all (new) comments and documentation.
  • I have added a note to RELEASE.md if relevant (new feature, breaking change, or notable bug fix).

@joeyshuttleworth
Collaborator Author

Missing nice docstrings on the new functions

@joeyshuttleworth
Collaborator Author

Not tested on Mac. But I think the subprocess git call should work.

I suppose this means that command-line git is a requirement.

@joeyshuttleworth joeyshuttleworth changed the title [WIP] Add a new "directory builder" which outputs version info Add a new "directory builder" which outputs version info Nov 3, 2025
Member

@MichaelClerx MichaelClerx left a comment

I would have thought a version number should do?
E.g. some variables somewhere, such as version_str and release=False or similar. We don't need to mix git hashes in, do we? I'd want people using it seriously to use a released version, not the GitHub one.

@joeyshuttleworth
Collaborator Author

Yeah, it should do if people are using the actual releases. But it might be useful for us if we're using it constantly between releases. The only downside I can see is possible portability issues if people don't have git. So maybe we just need to handle that case.

Another thing we could do is check for uncommitted changes with git diff-index and add "clean" or "dirty" to the commit hash
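A minimal sketch of what that could look like, including the case where git is missing. The function name `describe_git_state` and the `<hash>-clean` / `<hash>-dirty` format are made up for illustration; only the two git commands themselves are standard.

```python
import subprocess


def describe_git_state(repo_dir="."):
    """Return '<hash>-clean' or '<hash>-dirty', or None if unavailable."""
    try:
        commit = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        ).stdout.strip()
        # Exit code 0 means tracked files match HEAD; non-zero means changes
        dirty = subprocess.run(
            ["git", "diff-index", "--quiet", "HEAD", "--"],
            cwd=repo_dir, capture_output=True,
        ).returncode != 0
    except (FileNotFoundError, subprocess.CalledProcessError):
        # git is not installed (or repo_dir is missing), or this is not a repo
        return None
    return f"{commit}-{'dirty' if dirty else 'clean'}"
```

Note that `git diff-index` only looks at tracked files, so untracked files would not mark the worktree as dirty in this sketch.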

@MichaelClerx
Member

I really don't think we should be sending people data processed by software that has uncommitted changes!
Surely when we use this on anyone's data, we want to use an agreed-upon and released version?

@joeyshuttleworth
Collaborator Author

Yes, you're right. I'm mostly thinking of the files I have on my own computer. If I have a fully processed dataset sat somewhere for a while, it's nice to know exactly what code was used to produce it, even if I only want to pull up a graph or two for my own purposes. I feel like we should include all of this information in the file in case it's useful. I don't see what we'd gain by leaving it out.

If we don't want anyone to receive data from uncommitted changes, then it's probably a good thing to be able to see whether a clean or dirty worktree was used. I see this as a sort of safeguard against someone messing up and using the wrong branch or something.

The original issue asked for a git hash, but we can leave it out if everyone thinks it's unnecessary.

@MichaelClerx
Member

We'd gain easier-to-maintain code that doesn't rely on subprocess calls to git (which we can't even install as a dependency).

I realise it asks for a commit hash in the original issue, but I don't think that was a good idea. I think we could quite easily get what we need by having it output either the version number (for released versions), or an all-bets-are-off warning message if the release flag is set to False
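To make the suggestion concrete, here is a small sketch of the version-plus-release-flag idea. The names `VERSION`, `RELEASE` and `version_string`, the version number, and the wording of the warning are all placeholders, not anything that exists in pcpostprocess.

```python
VERSION = (0, 2, 0)  # placeholder version number
RELEASE = False      # placeholder flag: set to True only when cutting a release


def version_string(version=VERSION, release=RELEASE):
    """Return the version, with a warning appended for unreleased code."""
    text = "pcpostprocess " + ".".join(str(x) for x in version)
    if not release:
        text += " (UNRELEASED development version: all bets are off)"
    return text
```

The output of `version_string()` could then be written into any generated files without involving git at all.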

@joeyshuttleworth
Collaborator Author

We can handle the cases where git's missing. Thinking about this again, the biggest problem is probably that the git history isn't necessarily available when it's installed through PyPI/pip.

Also, this code won't return the right git hash when pcpostprocess is imported into a different module.
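One way around both problems, sketched here under the assumption that the version ends up in the installed package metadata (which is what a build-time tool such as setuptools_scm arranges), is to read it back with the standard library at runtime. The helper name `installed_version` and the "unknown" fallback are illustrative choices.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="pcpostprocess"):
    """Look up the version recorded in the installed package metadata."""
    try:
        return version(package)
    except PackageNotFoundError:
        # Package is not installed, e.g. when running from a bare source checkout
        return "unknown"
```

This needs neither a git executable nor the git history on the user's machine, and it does not depend on the current working directory.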

@joeyshuttleworth joeyshuttleworth changed the title Add a new "directory builder" which outputs version info Improve script and test output. Build using pyproject.toml with setuptools_scm Nov 4, 2025
@MichaelClerx
Member

OK This is looking better and I like having a dev dependency rather than a user dependency!

But what's the rationale for the directory builder stuff? Why make the test output go somewhere else with every commit?

@joeyshuttleworth
Collaborator Author

joeyshuttleworth commented Nov 5, 2025

> OK This is looking better and I like having a dev dependency rather than a user dependency!
>
> But what's the rationale for the directory builder stuff? Why make the test output go somewhere else with every commit?

It's supposed to go to the same place. It's just the content of the pcpostprocess_info.txt that changes.

The rationale is to ensure that any generated output has version information and other relevant details (commands, date/time, etc.). Each script also has a unique subdirectory name, so they won't overwrite each other if the user runs two scripts pointing at the same output directory.

@MichaelClerx
Member

But why are you doing that in test_herg_qc, test_leak_correct, and test_subtraction_plots?

@MichaelClerx
Member

I don't want them to generate any output, to be honest

@joeyshuttleworth
Collaborator Author

It's a convenient way to generate some example plots and to manually check that nothing is broken with them.

I think it's important to run fig.savefig() because some matplotlib errors only come up when you try to actually save the plot. If we don't want to save it permanently, we can use tempfile to not create any output (depending on a flag maybe).
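Something along these lines is what I have in mind (a sketch only: the environment variable name `KEEP_TEST_OUTPUT` and the test class are invented, and a plain file write stands in for the `fig.savefig(path)` call so the example does not depend on matplotlib):

```python
import os
import tempfile
import unittest


class ExamplePlotTest(unittest.TestCase):

    def setUp(self):
        # Hypothetical switch: keep output only when explicitly requested
        if os.environ.get("KEEP_TEST_OUTPUT"):
            self.output_dir = "test_output"
            os.makedirs(self.output_dir, exist_ok=True)
        else:
            tmp = tempfile.TemporaryDirectory()
            self.addCleanup(tmp.cleanup)  # directory is removed after each test
            self.output_dir = tmp.name

    def test_saves_file(self):
        path = os.path.join(self.output_dir, "example.txt")
        # A real test would call fig.savefig(path) here instead
        with open(path, "w") as f:
            f.write("placeholder for a saved figure\n")
        self.assertTrue(os.path.exists(path))
```

By default nothing is left behind, while setting the variable keeps the files in `test_output` for manual inspection.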

@mirams and @frankiepe were looking at these test-generated plots the other day, so might be worth keeping them.

@mirams
Member

mirams commented Nov 5, 2025

It is nice to have a test to generate plots, I used it recently because I wanted to edit the style of everyone's plots a bit. I don't mind too much if that is uncommenting a .show() line, but as Joey said I guess with PDF/png output and things it can be nicest to make sure it all comes out right and we haven't relied on alpha things that aren't working or whatever.

@MichaelClerx
Member

OK two points here:

  1. Sometimes it's nice to store test output. Agreed! (And also yes you can use a temporary directory to test savefig calls). But you also want to be able to pack tests with the pypi package, so that users can run them - and they won't want any output. I usually make each test file callable, and if it's called manually instead of by unittest then I let it do "debug" stuff e.g. make plots. Would that be an answer?

  2. Regardless of the above, we don't want the test output to be in a different location for every new commit!

@joeyshuttleworth
Collaborator Author

> 1. Sometimes it's nice to store test output. Agreed! (And also yes you can use a temporary directory to test savefig calls). But you also want to be able to pack tests with the pypi package, so that users can run them - and they won't want any output. I usually make each test file callable, and if it's called manually instead of by `unittest` then I let it do "debug" stuff e.g. make plots. Would that be an answer?

Could do. But I think it might be simpler to pass in an environment variable or command line argument to turn on output generation. Seems like the command-line arg approach is possible with pytest but is a bit more involved with unittest.

> 2. Regardless of the above, we don't want the test output to be in a different location for every new commit!

I think the current code doesn't output to a new location. I had some UUID things in where I had copied it over from a previous project, but I took them out (I hope I did anyway!).

@MichaelClerx
Member

You don't actually need to go through the unittest or pytest modules at all! See e.g. this example; we use this in Myokit and PINTS all the time. Usually the code is just

if __name__ == '__main__':
    unittest.main()

but as shown in this example you can do all sorts of things like enabling plotting/debugging, messing with logging etc.
A major advantage is you can just call python tests/my_test.py instead of having to know the complicated unittest/pytest syntax and all their environment variable mush
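For readers who have not seen the pattern, a stripped-down sketch of a callable test file follows. It is not the Myokit or PINTS code: the `--debug` flag, the `DEBUG` variable and the test itself are invented for illustration.

```python
import sys
import unittest

DEBUG = False  # switched on only when this file is run directly with --debug


class ExampleTest(unittest.TestCase):

    def test_sum(self):
        values = [1, 2, 3]
        if DEBUG:
            # Place for plots or extra logging during manual runs
            print("values:", values)
        self.assertEqual(sum(values), 6)


if __name__ == "__main__":
    # Strip our own (hypothetical) flag before unittest parses the rest
    if "--debug" in sys.argv:
        sys.argv.remove("--debug")
        DEBUG = True
    # exit=False returns instead of calling sys.exit(); drop it to get a
    # non-zero exit status when a test fails
    unittest.main(exit=False)
```

Run through `python -m unittest`, the guard at the bottom never executes and the tests stay silent; run as `python tests/my_test.py --debug`, the extra output is enabled.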

Comment thread tests/test_herg_qc.py Outdated

plot_dir = "test_output"
os.makedirs(plot_dir, exist_ok=True)
plot_dir = directory_builder.setup_output_directory("test_output",
Member

These lines, @joeyshuttleworth ! If the above works just as well, then let's stick with that! No need to make the tests use a function only the scripts use. The less stuff to untangle the better

Comment thread tests/test_leak_correct.py Outdated

if not os.path.exists(self.output_dir):
    os.makedirs(self.output_dir)
self.output_dir = directory_builder.setup_output_directory("test_output",
Member

and here and below

@joeyshuttleworth
Collaborator Author

Yeah, we could run the tests like that. But you would lose the convenience of running everything using pytest or unittest.

Removing the directory_builder code from the tests is good practice because it uncouples them, so we should do that.

@MichaelClerx
Member

> Yeah, we could run the tests like that. But you would lose the convenience of running everything using pytest or unittest.

Nonono, for individual tests you don't go through unittest
For all tests you use unittest

@MichaelClerx
Member

So if you want to run all tests locally, or in actions, you do e.g.

python -m unittest

But if you want to run a single test, you don't have to do

python -m unittest NameOfMyTestClass.somemethod

or whatever the syntax is, you can just do

python tests/my_test.py

which also uses unittest, internally, but lets you just add any command line args or anything you want without having to do a deep dive into the unittest framework

@joeyshuttleworth
Collaborator Author

Yeah, that's nice but if you use pytest you can get it to output everything at once which is convenient. Otherwise you have to run every test manually

Joseph G. Shuttleworth added 2 commits November 5, 2025 12:56
@MichaelClerx MichaelClerx self-requested a review November 5, 2025 13:17
Member

@MichaelClerx MichaelClerx left a comment

Thanks @joeyshuttleworth , looks good

@MichaelClerx MichaelClerx merged commit 3ef4b0b into main Nov 5, 2025
6 checks passed
@MichaelClerx MichaelClerx deleted the js_new_directory_builder branch November 5, 2025 13:19

Development

Successfully merging this pull request may close these issues.

Add pcpostprocess version AND (if possible) commit hash to generated stuff

3 participants