Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

**kwargs input and anesthetic.NestedSamples output #90

Merged
merged 95 commits into from
Jan 29, 2024

Conversation

AdamOrmondroyd
Copy link
Contributor

@AdamOrmondroyd AdamOrmondroyd commented Jul 27, 2022

I've added a new interface pypolychord.run() which uses **kwargs rather than the PolyChordSettings object. It also returns an anesthetic.NestedSamples object rather than PolyChordOutput if anesthetic is already installed, otherwise it returns None.

The examples have been updated to use this interface.

Edit: run_pypolychord -> run_polychord. I'm glad the new interface uses just run!

Copy link
Member

@williamjameshandley williamjameshandley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many thanks for this @Ormorod, a more pythonic interface will likely definitely be useful, and the new kwargs treatment looks excellent (similar in spirit to #47)

However, as discussed via email we cannot change the run_pypolychord interface without breaking a lot of other people's code. You should write a new additional interface, e.g. run, which could be imported as from pypolychord import run, and leave run_pypolychord as-is.

You should also revert all the Python_Functions changes which can be kept as the legacy interface, particularly whilst we are stabilising the new interface. For example, there's no reason why nDerived, prior, and dumper can't be moved to *args or **kwargs

@AdamOrmondroyd
Copy link
Contributor Author

Many thanks for this @Ormorod, a more pythonic interface will likely definitely be useful, and the new kwargs treatment looks excellent (similar in spirit to #47)

However, as discussed via email we cannot change the run_pypolychord interface without breaking a lot of other people's code. You should write a new additional interface, e.g. run, which could be imported as from pypolychord import run, and leave run_pypolychord as-is.

Ah yes I'd intended to put the old run_polychord() back in, temporarily removing it made the diff much more helpful.
btw the function doesn't have "py" in it, I've made this mistake many times.

You should also revert all the Python_Functions changes which can be kept as the legacy interface, particularly whilst we are stabilising the new interface. For example, there's no reason why nDerived, prior, and dumper can't be moved to *args or **kwargs

Okay, I just changed them all so I could test my changes. Would you like to keep the changes to run_pypolychord.py/.ipynb?

Copy link
Member

@williamjameshandley williamjameshandley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent work. The kwargs interface will be much more robust and future-proof.

In the new interface we should assume an anesthetic install, as it is now mature enough, and we should refine the example so that it is as pythonic and streamlined as possible, whilst showcasing what we can do (e.g. putting an anesthetic gui at the end).

Could we keep a copy of the old run_polychord.py, renamed to run_polychord_legacy_interface.py?

Comment on lines 46 to 55
kwargs = {
"nDerived": nDerived,
"prior": prior,
"dumper": dumper,
"file_root": 'gaussian',
"nlive": 200,
"do_clustering": True,
"read_resume": False,
"paramnames": paramnames,
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we pass these as keywords to run explicitly, rather than a dictionary?

run_pypolychord.py Outdated Show resolved Hide resolved
pypolychord/polychord.py Outdated Show resolved Hide resolved
pypolychord/polychord.py Outdated Show resolved Hide resolved
pypolychord/polychord.py Outdated Show resolved Hide resolved
run_pypolychord.py Outdated Show resolved Hide resolved
run_pypolychord.ipynb Outdated Show resolved Hide resolved
@AdamOrmondroyd
Copy link
Contributor Author

@williamjameshandley I've taken the opportunity to add a few more test reflecting some of the recent PRs using the new interface, lmk if you think this should be its own PR

pypolychord/polychord.py Outdated Show resolved Hide resolved
pypolychord/polychord.py Outdated Show resolved Hide resolved
run_pypolychord.py Outdated Show resolved Hide resolved
run_pypolychord.ipynb Outdated Show resolved Hide resolved
@@ -75,7 +72,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install mpi4py pytest pytest-mpi fortranformat
python -m pip install mpi4py pytest pytest-mpi fortranformat anesthetic
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We might not actually need pytest-mpi (which allows us to turn tests on or off depending on whether the test is mpirun) - do you see us ever wanting to test something on serial mode only?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll always want to test mpi, so leave this in

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can test with mpi without pytest-mpi, we only need it if we want to have tests which run specifically only with or without mpi

Copy link
Member

@williamjameshandley williamjameshandley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking very nice, many thanks @AdamOrmondroyd (and @appetrosyan for comments on interfacing over the years)

a few adjustments to consider/discuss inline

pypolychord/polychord.py Outdated Show resolved Hide resolved
'nlives': {},
'seed': -1,
}
default_kwargs['grade_frac'] = [1.0]*len(default_kwargs['grade_dims'])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this line add, which 'grade_frac': [1.0] wouldn't?

Copy link
Contributor Author

@AdamOrmondroyd AdamOrmondroyd Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grade_frac needs to have the same length as grade_dims (copied from the original interface, and forgetting to do this just failed the tests - glad we have them now!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just realised that this isn't quite right, since the default grade_frac should the same length as the input grade_dims, not the default grade_dims

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in cdaf063

Comment on lines 557 to 560
if not set(kwargs.keys()) <= set(default_kwargs.keys()):
raise TypeError(f"{__name__} got unknown keyword arguments {kwargs.keys() - default_kwargs.keys()}")
default_kwargs.update(kwargs)
kwargs = default_kwargs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is neat!

Comment on lines 562 to 572
try:
if rank == 0:
os.makedirs(kwargs['base_dir'])
except OSError:
pass

try:
if rank == 0:
os.makedirs(os.path.join(kwargs['base_dir'], 'clusters'))
except OSError:
pass

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these be combined?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think no, as I believe the OSError expected here is that the directory already exists, and it may be the case that base_dir already exists, but base_dir/clusters does not.

I can do something neater with pathlib.Path, which can create several layers of directories at once, and can be told not to worry if the directory already exists

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, I've also converted the other uses of os to pathlib

quickstart.py Outdated Show resolved Hide resolved
quickstart.py Outdated Show resolved Hide resolved
@@ -75,7 +72,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install mpi4py pytest pytest-mpi fortranformat
python -m pip install mpi4py pytest pytest-mpi fortranformat anesthetic

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we'll always want to test mpi, so leave this in

README.rst Show resolved Hide resolved
pypolychord/polychord.py Show resolved Hide resolved

def _legacy_make_resume_file(settings, loglikelihood, prior):
kwargs = settings.__dict__
_make_resume_file(loglikelihood, prior = prior, **kwargs)
Copy link
Contributor Author

@AdamOrmondroyd AdamOrmondroyd Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realised I could dispense with the duplication between run and the old run_polychord with the same approach as here. @williamjameshandley what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@williamjameshandley could do with looking at this before I completely forget what I was thinking

@@ -75,7 +72,7 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
python -m pip install mpi4py pytest pytest-mpi fortranformat
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe these should be added to requirements.txt (or requirements_test.txt)?

@AdamOrmondroyd AdamOrmondroyd changed the title anesthetic.NestedSamples output and **kwargs input **kwargs input and anesthetic.NestedSamples output Jan 25, 2024
Comment on lines +154 to +155
if rank == 0:
Path(settings.cluster_dir).mkdir(parents=True, exist_ok=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

Copy link
Member

@williamjameshandley williamjameshandley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good -- not sure I like import anesthetic as ac, but if you're happy please squash and merge.

quickstart.py Outdated
Comment on lines 65 to 66
import anesthetic as ac
fig, ax = ac.make_2d_axes(['p0', 'p1', 'p2', 'p3', 'r'])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this to from anesthetic import make_2d_axes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of cluttering the namespace personally, and I thought that Lukas and you liked ac as a shorthand for anesthetic!

I'll settle for import anesthetic then anesthetic.make_2d_axes

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also imported numpy rather than clutter up the namespace with log and pi. Personally I've always found this annoying when I want to just substitute in a simple likelihood to check I haven't done anything daft, and copy-pasting the one from run_pypolychord.py is my go-to, but then I have to change log->np.log etc

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’ll weigh in on the discussion.

This is about consistency. If most of the project uses a “cluttered” namespace, it would be very distracting for users and other engineers.

The “cluttering” of a namespace makes it easy to substitute different versions of similar API elements: constants, variables, functions etc.

Consider if you needed to replace numpy.pi with scipy.pi. We’re talking a single line change in the imports compared to every single instance having to be renamed.

Another huge benefit IMO is readability. The namespace is not something the programmer sees. By contrast every place numpy functions or constants are used is by definition visible to the programmer. If there is no ambiguity, direct imports communicate more and make the program shorter. While sys.os functions may be harder to parse if imported directly, numpy and scipy functions are by definition known to the user.

Copy link
Contributor Author

@AdamOrmondroyd AdamOrmondroyd Jan 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can see the convenience of substituting similar api elements, however iirc with anesthetic we were once caught out by the differences between the numpy and scipy Cholesky decompositions when we substituted one for the other

I understand what you mean about readability, but then it's also sometimes nice to be reminded where a function came from without going back to the preamble

However I will choose my battles and for the sake of finally seeing the back of this PR, I've done as Will originally suggested

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I will choose my battles and for the sake of finally seeing the back of this PR, I've done as Will originally suggested

Good choice.

That's usually the way it goes when you're contributing. The maintainers set style and convention, and in the few situations where I've seen maintainers change their mind about either of those things, it was a separate discussion that came with its own PR.

I understand what you mean about readability, but then it's also sometimes nice to be reminded where a function came from without going back to the preamble

Very much up to the maintainer. It's a question of style, and each style has trade-offs. On that note, If I were curious about where a function came from, most all code editors (Emacs, Vim etc.) have a facility that will display the documentation for a symbol if you hover. There's also code lenses that will optionally qualify all functions regardless of the actual text of the program.

I can see the convenience of substituting similar api elements, however iirc with anesthetic we were once caught out by the differences between the numpy and scipy Cholesky decompositions when we substituted one for the other

Sure. But that isn't exactly an argument for qualifying the paths.

First, as a result of that debacle, you know that the two are not the same, and that you do need to disambiguate, this is valuable information if you plan to contribute/co-maintain a project.

Secondly, there's more than one way to disambiguate; only one way to do it is to qualify the paths. Another way to do it, and I'll argue better, is to disambiguate by intent: np.choleskyupper_triangular, and scipylower_triangular. Now your code communicates information to the user, without them having to dig through documentation to find out that the difference is in the defaults.

@AdamOrmondroyd AdamOrmondroyd merged commit b3daab9 into PolyChord:master Jan 29, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants