Theory citation function #123

Merged: peterstangl merged 43 commits into flav-io:master from MJKirk:theory-citation-function on Mar 6, 2021

MJKirk
Contributor

@MJKirk MJKirk commented Oct 15, 2020

A start on the theory citation function as discussed in #120 - just the basics so far, not to be merged yet.

@coveralls

coveralls commented Oct 15, 2020

Coverage Status

Coverage increased (+0.02%) to 93.886% when pulling 13f3946 on MJKirk:theory-citation-function into 7e0502c on flav-io:master.

@MJKirk
Contributor Author

MJKirk commented Oct 23, 2020

One issue that has occurred to me:
In some cases, a lot of the "citation dependence" for a theory prediction lives in the parameters. E.g. for an SM prediction of meson mixing, I would normally cite the Inami-Lim loop function, the QCD RG factors eta, the CKM elements, and the decay constants and bag parameters - in flavio the loop function is done in code (and now cited in MJKirk@8be3df2), but all the others are just parameters. I would imagine this is true for many other observables.
I'm not sure if there's a simple way of dealing with this - I guess what's needed is to add citation info into the parameter data files, but only register it when they get used, not when the parameter files are read.

@peterstangl
Collaborator

One issue that has occurred to me:
In some cases, a lot of the "citation dependence" for a theory prediction lives in the parameters. E.g. for an SM prediction of meson mixing, I would normally cite the Inami-Lim loop function, the QCD RG factors eta, the CKM elements, and the decay constants and bag parameters - in flavio the loop function is done in code (and now cited in MJKirk@8be3df2), but all the others are just parameters. I would imagine this is true for many other observables.
I'm not sure if there's a simple way of dealing with this - I guess what's needed is to add citation info into the parameter data files, but only register it when they get used, not when the parameter files are read.

One idea would be to have all parameters stored not in a traditional dict, but in some new subclass of dict that registers a reference anytime a certain key is accessed. This could look similar to

class dict_cite(dict):
    
    def __getitem__(self, key):
        citations.register_key(key) 
        return dict.__getitem__(self, key)

    def get(self, key, *args, **kwargs):
        citations.register_key(key)
        return dict.get(self, key, *args, **kwargs)

But this might also lead to some problems. E.g. if you show the content of such a dict in a Jupyter notebook, it would access all keys and thus register the references for all its items. One work around would be to allow switching on and off the citations.register_key method in such a way that it ignores any calls when it is switched off (e.g. by setting some attribute of the Citations class to False). Then the citations.register_key method could e.g. be switched on and off by the Prediction class such that it would only register the references if a key of the parameter dictionary is accessed by the Prediction class.

@MJKirk
Contributor Author

MJKirk commented Oct 29, 2020

Yes, something exactly like AwareDict would work - a ParameterCitationsAwareDict as it were (a snappier name might be in order).

Regarding the inspire info for the parameters, currently the parameter data is split over 3 files: parameters_uncorrelated, parameters_correlated, and parameter_metadata. We could just put the inspire key into the metadata file, but this seems bad since the references could easily not be updated when the numbers get changed.
It seems like it would be better (if more complicated) to merge all these into a single parameter file, much like the single measurements file, and then it will be obvious to update the inspire at the same time as the numerics.
(Actually, it looks like very early on it used to be this way and then the file was split in b30aa73, do you remember why you did this @DavidMStraub?)

@peterstangl
Collaborator

peterstangl commented Oct 30, 2020

Thanks, these are all good points! But since they are only indirectly related to the theory citation function, I opened the new issue #124 for further discussion.

@MJKirk
Contributor Author

MJKirk commented Nov 3, 2020

Notes for myself on what I'm doing

  • Add references based on where in the code an arxiv reference is mentioned (grepped for arxiv or hep-ph)
  • Check the references noted in the release notes for v2

@MJKirk
Contributor Author

MJKirk commented Dec 7, 2020

So I've now gone through and added citations based on the arXiv references in the comments / docstrings.

The release notes for v2 mention a few specific papers - for the Higgs physics I've added citations for the paper by David and Adam Falkowski, while for the beta decay and the updated B->D(*) form factors the "new stuff" is basically just in new parameters, which goes back to our discussion earlier about citing parameters (and then became #124).

There are two things which seem like they should come from somewhere specific but have no comments:

  1. The ee->WW scattering
    r"""Functions for $e^+ e^-\to W^+ W^- scattering"""

    All the specific numbers in the formulae suggest to me they were pulled from somewhere.
  2. Neutrino trident
    sigma_tot = abs(CA_m)**2 + abs(CV_m)**2 + abs(CA_NP_e)**2 + abs(CV_NP_e)**2 + abs(CA_NP_t)**2 + abs(CV_NP_t)**2

    This general expression isn't in the "original" nu trident paper 1406.2332, and it's not obvious to me that everything else drops out such that you only need the Wilson coefficients. Was this expression taken from a paper, or is it perhaps simple enough to derive with a bit of thought?

Both observables were added by @DavidMStraub so he should know.

@peterstangl
Collaborator

Those are also good points! Thread safety might actually be important since multiprocessing is used even within flavio, e.g. when computing covariance matrices. It would be quite inconvenient if the citation feature did not work in such a case. Getting all the references for computing a covariance matrix seems to be a reasonable task that should not require computing the covariance matrix on a single thread.

But how can the citation feature be made thread safe? I have never really thought about such problems before but I think it requires storing the references in some object that can be shared between different processes.

One possibility that I found is to use multiprocessing.Manager() to create a shared dictionary. The dictionary's keys can be used like a set and all values can be set to None.
A small example code is

import multiprocessing

manager = multiprocessing.Manager()
shared_dict = manager.dict()

def job(string):
    shared_dict[string] = None

pool = multiprocessing.Pool(2)
pool.map(job, ['a', 'b', 'c', 'a'])
pool.close()
pool.join()

Then shared_dict.keys() will return ['a','b','c'].
Since we don't need different instances of the Citations class, we can just define a class attribute that is a shared dictionary and only use class methods. citations.py could then look like

import multiprocessing

class Citations:

    _papers_to_cite =  multiprocessing.Manager().dict()

    @classmethod
    def reset(cls):
        cls._papers_to_cite.clear()

    @classmethod
    def register(cls, inspire_key):
        cls._papers_to_cite[inspire_key] = None

    @classmethod
    def to_set(cls):
        return set(cls._papers_to_cite.keys())

    @classmethod
    def to_string(cls):
        return ",".join(cls._papers_to_cite.keys())

__init__.py would then just need the line

from .citations import Citations

and references could be added by simply calling

flavio.Citations.register('my_inspire_key')

I don't know if it's actually a good idea to implement it like this, but a first try works as expected:

import multiprocessing
import flavio

def job(string):
    flavio.Citations.register(string)
    
pool = multiprocessing.Pool(2)
pool.map(job, ['a', 'b', 'c', 'a'])
pool.close()
pool.join()

Then flavio.Citations.to_set() returns {'a', 'b', 'c'}.

@peterstangl
Collaborator

Do you know if it also works when using threading or multiprocess?

threading is not a problem for either multiprocess.Manager() or multiprocessing.Manager(). Unfortunately, however, multiprocessing.Manager() doesn't work with multiprocess and multiprocess.Manager() doesn't work with multiprocessing. But it is possible to switch between different managers at runtime, which would still allow using e.g. multiprocess instead of multiprocessing (see below).

Concerning the API, what you are suggesting is a singleton pattern, but I think in this case there is absolutely no benefit in making this a class. It's even confusing as a user could just do citations = flavio.citations.Citations(), but they are not supposed to. Consider the following, essentially equivalent code:
[...]
It's simply a module, not a class, and we can do

flavio.citations.register('my_inspire_key')

I agree that having a class with only class methods was maybe not the best idea. I think it's a very good idea to use a module. But this module can actually be a class instance that has nice features like e.g. properties. This is described e.g. at https://stackoverflow.com/questions/2447353/getattr-on-a-module/7668273#7668273 after "Update" .

I have a working implementation based on this that makes the following things possible:

  • flavio.citations is a module that can be imported in __init__.py with from . import citations, but it is also a class instance that has properties like flavio.citations.string and flavio.citations.set and also things like str(flavio.citations) or list(flavio.citations) work as expected.
  • Citations can be added with flavio.citations.register('my_inspire_key').
  • There is a flavio.citations.reset() method that resets the citations such that they contain only the flavio paper.
  • There is a flavio.citations.clear() method that completely clears the citations.
  • One can register citations in different processes in a multiprocessing environment thanks to multiprocessing.Manager().
  • Registering citations in different threads in a threading environment works as well.
  • One can switch the manager from multiprocessing.Manager() to e.g. multiprocess.Manager() using the flavio.citations.switch_manager() method.
  • A context manager is available from flavio.citations.scope that works as follows:
    with flavio.citations.scope as citations:
        flavio.sm_prediction("DeltaGamma_s")
        citations.register('my_inspire_key')
        result = citations.set
    such that result is {'Beneke:2003az', 'Inami:1980fz', 'my_inspire_key'} but flavio.citations is not changed.
  • the context manager is used in the theory_citations method of the Observable class.

All of this is achieved with the following citations.py:

from multiprocessing import Manager
import flavio
import sys


class CitationScope(object):

    def __enter__(self):
        self._citations_global = flavio.citations._papers_to_cite
        flavio.citations._papers_to_cite = flavio.citations._manager.dict()
        return flavio.citations

    def __exit__(self, type, value, traceback):
        flavio.citations._papers_to_cite = self._citations_global


class Citations:

    _manager = Manager()
    scope = CitationScope()

    def __init__(self):
        self._papers_to_cite = self._manager.dict()
        self.register("Straub:2018kue")

    def __iter__(self):
        for citation in self._papers_to_cite.keys():
            yield citation

    def __str__(self):
        return ",".join(self._papers_to_cite.keys())

    @property
    def string(self):
        return str(self)

    @property
    def set(self):
        return set(self)

    def register(self, inspire_key):
        self._papers_to_cite[inspire_key] = None

    def clear(self):
        self._papers_to_cite.clear()

    def reset(self):
        self.clear()
        self.register("Straub:2018kue")

    def switch_manager(self, manager):
        self._manager = manager
        self.__init__()


sys.modules[__name__] = Citations()

If we agree that this is actually a good solution for the points we were discussing, I can push my version to the MJKirk:theory-citation-function branch if @MJKirk agrees and doesn't have any uncommitted changes there.

@peterstangl
Collaborator

I agree, the context manager was not very convincing. I have implemented your suggestions, which actually has the nice side effect that, since the context manager is now a method, it can receive a Manager() instance as an argument. This makes using e.g. multiprocess more convenient:

with flavio.citations.collect(multiprocess.Manager()) as citations:
    pool = multiprocess.Pool(2)
    pool.map(job, ['a','b','c','a'])
    pool.close()
    pool.join()

This manager is also passed on to the context manager used inside the theory_citations method of Observable, which looks like

    def theory_citations(self, *args, **kwargs):
        with flavio.citations.collect() as citations:
            flavio.sm_prediction(self.name, *args, **kwargs)
        return citations.set

I have also made some other improvements: in particular, one no longer loses the references when using the switch_manager method, and the initial citations are now passed as an argument to the Citations class instead of being hard-coded inside the class.
The new version of citations.py looks as follows:

from multiprocessing import Manager
import flavio
import sys
from itertools import chain


class CitationScope:

    def __init__(self, manager=None):
        if manager is not None:
            self._manager = manager
        else:
            self._manager = flavio.citations._manager

    def __enter__(self):
        self._citations_global = flavio.citations
        flavio.citations = Citations(self._manager)
        return flavio.citations

    def __exit__(self, type, value, traceback):
        flavio.citations = self._citations_global


class Citations:

    collect = CitationScope

    # Immutable defaults avoid sharing one list object between instances.
    def __init__(self, manager, initial_citations=(), copied_citations=()):
        self._manager = manager
        self._initial_citations = initial_citations
        self._papers_to_cite = self._manager.dict()
        self._papers_to_cite.update(
            {k:None for k in chain(initial_citations, copied_citations)}
        )

    def __iter__(self):
        for citation in self._papers_to_cite.keys():
            yield citation

    def __str__(self):
        return ",".join(self._papers_to_cite.keys())

    @property
    def string(self):
        return str(self)

    @property
    def set(self):
        return set(self)

    def register(self, inspire_key):
        self._papers_to_cite[inspire_key] = None

    def clear(self):
        self._papers_to_cite.clear()

    def reset(self):
        self.clear()
        self._papers_to_cite.update({k:None for k in self._initial_citations})

    def switch_manager(self, manager):
        self.__init__(manager, self._initial_citations, self.set)


sys.modules[__name__] = Citations(Manager(), ["Straub:2018kue"])

@MJKirk
Contributor Author

MJKirk commented Feb 15, 2021

Okay, having caught up with the discussion it all sounds good, and the new code looks nice to me.
I don't have anything uncommitted so feel free to push your code @peterstangl.

@peterstangl
Collaborator

I have thought about this again and also considered the computing speed of the approach. I noticed that these Manager().dict() objects are actually very slow. In particular, writing a single entry to such a shared dict took me around 20 μs, while the same write to a normal dict took me only 30 ns... Given how often the register() method might actually be called during the computation of theory predictions, I'm afraid the current approach is a bit impractical.
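
A comparison of this kind could be reproduced with something like the sketch below. It is only a rough illustration: the helper seconds_per_write is a hypothetical name, the lambda adds call overhead to both measurements, and the absolute numbers will depend on the machine, so no particular output is claimed here.

```python
import timeit
from multiprocessing import Manager


def seconds_per_write(store, repeats=10_000):
    """Average wall-clock time of one `store[key] = None`, in seconds."""
    total = timeit.timeit(
        lambda: store.__setitem__("Straub:2018kue", None),
        number=repeats,
    )
    return total / repeats


if __name__ == "__main__":
    plain = seconds_per_write({})
    # The manager runs a server process; every write is a round trip to it.
    with Manager() as manager:
        shared = seconds_per_write(manager.dict(), repeats=1_000)
    print(f"dict:           {plain * 1e9:10.0f} ns per write")
    print(f"Manager().dict: {shared * 1e9:10.0f} ns per write")
```

The `if __name__ == "__main__"` guard matters on platforms where multiprocessing starts new processes by re-importing the main module.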

I played around a bit with other shared objects that might be faster. I tried out multiprocessing.Queue, which is considerably faster and takes only 1 μs for writing an entry. However, I ran into a problem described at https://docs.python.org/3/library/multiprocessing.html#multiprocessing-programming under 'Joining processes that use queues' that led to the execution being blocked if pool.join() is called before the queue is emptied. So I think queues cannot help us here.

Another shared object that is even faster is multiprocessing.Array. However, such an array has a fixed length and is not convenient for storing text strings. Nevertheless, I came up with a way of using it and in my implementation, a call to the register() method takes only 600 ns, which might be fast enough. Furthermore, this approach works with multiprocessing, multiprocess, and threading out of the box and there is no need to use Manager().
The drawback of my implementation is that I have to get a set of all possible citations, which I currently do by using a regexp on all .py files in the flavio source folder. This is admittedly a bit ugly. Having this set, I enumerate it and use a boolean array of the same length in which register() switches the entry corresponding to a given inspire id to True.
This means that the possible arguments of register() are limited to those contained in the flavio source, and the user cannot call register() with other arguments. I wouldn't consider this too big a problem, though, since the citation functionality is mostly about adding references to things in the flavio source code.
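
To make the idea concrete, here is a minimal sketch of a flag-array registry. It is not the implementation pushed to the branch: the class name FlagCitations and its layout are assumptions for illustration, and the set of known keys is passed in directly instead of being extracted from the source. Only threads are exercised here; sharing the array with worker processes would additionally require that they inherit it.

```python
import multiprocessing
import threading


class FlagCitations:
    """Hypothetical registry: one shared boolean flag per known key."""

    def __init__(self, known_keys, initial=()):
        self._keys = sorted(known_keys)
        self._index = {key: i for i, key in enumerate(self._keys)}
        # Shared, lock-protected array of signed chars, zero-initialised.
        self._flags = multiprocessing.Array("b", len(self._keys))
        self._initial = tuple(initial)
        self.reset()

    def register(self, key):
        """Switch on the flag that belongs to `key`."""
        try:
            position = self._index[key]
        except KeyError:
            raise KeyError(
                f"{key!r} is not a known citation key; "
                "add it to the list of known keys first"
            ) from None
        self._flags[position] = 1

    def reset(self):
        """Clear all flags, then re-register the initial citations."""
        for i in range(len(self._keys)):
            self._flags[i] = 0
        for key in self._initial:
            self.register(key)

    @property
    def set(self):
        # Slicing the shared array returns a plain list snapshot.
        return {k for k, flag in zip(self._keys, self._flags[:]) if flag}


citations = FlagCitations(
    {"Straub:2018kue", "Inami:1980fz", "Beneke:2003az"},
    initial=["Straub:2018kue"],
)
workers = [
    threading.Thread(target=citations.register, args=(key,))
    for key in ["Inami:1980fz", "Inami:1980fz"]
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(sorted(citations.set))  # ['Inami:1980fz', 'Straub:2018kue']
```

Registering the same key twice is harmless because it only sets the same flag again, which is what makes the array behave like a set.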

So in the end, I think we have the following possibilities:

  • Use the implementation with Manager().dict() but add an on/off switch to flavio.citations in such a way that register() is just doing a pass if it's switched off. Then one could have it switched off by default and only switch it on by the context manager (i.e. also for the theory_citations method of Observable).
    Pros:

    • By default, the register() calls don't take any time and the code is as fast as ever.

    Cons:

    • A Manager() class is needed that has to be switched when using multiprocess.
    • To get the citations for some user-defined code, all the code has to be put into a context manager.
    • Code run with the citation switch on might be considerably slowed down by the slow Manager().dict().
  • Use a conventional dict()
    Pros:

    • The register() calls are very fast and the code is essentially as fast as ever.

    Cons:

    • No thread or process safety. The citation feature would not work as expected already for simple things like flavio.sm_uncertainty called with threads=2.
  • Use a shared Array
    Pros:

    • The register() calls are reasonably fast.
    • register() is thread and process safe and works with multiprocessing, multiprocess, and threading.
    • No Manager() class is needed.

    Cons:

    • A set of all possible citations has to be determined. This might involve applying regexps to the source code, which seems not completely right.
    • register() can only be used inside the flavio source code but not in code written by the user (or the user has to provide the set of possible citations somehow) (or there is a different way of getting all possible citations instead of reading the source code)

@MJKirk @DavidMStraub what do you think?

@MJKirk
Contributor Author

MJKirk commented Feb 16, 2021

It seems to me that it's not just sm_covariance but (probably all) things with the threads option set != 1 that don't currently work.
Which then breaks what I had in mind of being the standard use of this, just calling print(flavio.default_citations) at the end of your code without any other modifications and getting the citations. Taking an example of some simple fit to the flavour anomalies:

import flavio
from flavio.statistics.likelihood import FastLikelihood
from wilson import Wilson
import flavio.plots as fpl

my_obs = (
	("<Rmue>(B+->Kll)", 1.1, 6.0),
	("<Rmue>(B0->K*ll)", 0.045, 1.1),
	("<Rmue>(B0->K*ll)", 1.1, 6.0)
)

L = FastLikelihood(name = "likelihood test", observables = my_obs)
L.make_measurement(threads = 4)

def my_LL(wcs):
	ReC9mu, ImC9mu = wcs
	par = flavio.default_parameters.get_central_all()
	wc = Wilson({"C9_bsmumu" : ReC9mu + 1j* ImC9mu, "C10_bsmumu" : -(ReC9mu + 1j*ImC9mu)},
	            scale = 4.8, eft = "WET", basis = "flavio")
	return L.log_likelihood(par, wc)

C9contour_data = fpl.likelihood_contour_data(my_LL, -2, 0, -3.5, 3.5, n_sigma = (1, 2), threads = 4)

print(flavio.default_citations)

The final line just prints
Straub:2018kue
whereas Straub:2018kue,Straub:2015ica,Gubernari:2018wyi,Seidel:2004jh,Beneke:2004dp,Asatrian:2003vq,Greub:2008cy,Bobeth:2013uxa
is what you should get (and do if you change either of the threads=4 to threads=1).
As I say, my idea is this easy one-liner would work, and therefore require minimum effort for people to use, so my preference is to come up with some solution such that this does work.

Have you done any "realistic" tests of how much the Manager().dict() approach changes the overall runtime @peterstangl? You say it takes about 20 microseconds per write, how does that compare to the time taken for everything else? I'm just thinking if the overall effect is actually only a 1% increase that isn't actually that bad.

@peterstangl
Collaborator

Have you done any "realistic" tests of how much the Manager().dict() approach changes the overall runtime @peterstangl? You say it takes about 20 microseconds per write, how does that compare to the time taken for everything else? I'm just thinking if the overall effect is actually only a 1% increase that isn't actually that bad.

%%timeit
flavio.sm_prediction("BR(Bs->mumu)")

yields around 470 μs with the conventional dict() and 570 μs with the Manager().dict() version. So in this case it's an increase by 20%... For the Array() version, one gets around 470 μs like with the dict()

For theory predictions that have to call register() many times, it's even worse:

%%timeit
flavio.sm_prediction("<Rmue>(B0->K*ll)", 0.045, 1.1)

yields around 170 ms for both dict() and Array() but 240 ms for Manager().dict(), so the increase is 40%...

Concerning the regexp stuff used by the Array() version, this could be easily avoided if in addition to the register() commands inside the code, one would also have a YAML file that contains all the inspire ids that are used. This YAML file could be automatically generated/updated by a standalone script that uses the regexp stuff, but it could also be manually edited when new citations are added to the code.
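
Such a standalone script could look roughly like the sketch below. Everything in it is an assumption for illustration: the function names find_registered_keys and to_yaml_list are hypothetical, the regular expression assumes that keys are always passed to register() as string literals, and the YAML layout (a flat list of quoted strings) is a guess, not necessarily the format of the real file.

```python
import re

# Assumed call shape: register("Key:2020abc") or register('Key:2020abc').
REGISTER_CALL = re.compile(r"""\bregister\(\s*["']([^"']+)["']\s*\)""")


def find_registered_keys(source_text):
    """Return the set of string literals passed to register() calls."""
    return set(REGISTER_CALL.findall(source_text))


def to_yaml_list(keys):
    """Render the keys as a sorted YAML list of quoted strings."""
    return "".join(f'- "{key}"\n' for key in sorted(keys))


sample = '''
flavio.citations.register("Inami:1980fz")
flavio.citations.register('Beneke:2003az')
flavio.citations.register("Inami:1980fz")
'''
print(to_yaml_list(find_registered_keys(sample)), end="")
# - "Beneke:2003az"
# - "Inami:1980fz"
```

A real script would read every .py file in the source folder and write the result to the YAML file; the string-based version here just keeps the example self-contained.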

@MJKirk
Contributor Author

MJKirk commented Feb 16, 2021

Okay, yeah, that's too much of a slowdown. Darn.

It seems to me that the Array() + regexp solution, with some preprocessing as you suggest, makes for the most user friendly version -- from their perspective everything works automatically.
People adding / editing code (might) have to worry about adding inspire ids to the YAML, but as you say that could be done semi-automatically by some script.

@peterstangl
Collaborator

OK, I've now pushed a new version using multiprocessing.Array. The list of all possible citations is taken from a YAML file in order to avoid the potentially error-prone extraction using regexp in flavio's main routines. A standalone helper script is included that generates this YAML file from the source code using regex. This script should only be run by people editing the citations in the source code.

@MJKirk @DavidMStraub what do you think about this solution?

@MJKirk
Contributor Author

MJKirk commented Feb 19, 2021

Generally I'm happy. I don't think it's a problem that you can only cite papers within the main flavio source code, since the "point" of this feature is to cite the source of flavio's internals. (Although actually if you did want to for some personal reason, you could just add to the citations yaml file manually.)

In terms of some small improvements, should there be a nicer error message if you try and register a paper that isn't in citations.yml? Just like a reminder that new citations need to be added or the script needs to be re-run, whereas right now you just get KeyError.

@peterstangl
Collaborator

Generally I'm happy. I don't think it's a problem that you can only cite papers within the main flavio source code, since the "point" of this feature is to cite the source of flavio's internals. (Although actually if you did want to for some personal reason, you could just add to the citations yaml file manually.)

Well in principle it would also be easy to add a class method to flavio.citations that would allow the user to add references. The user-defined references just have to be added to the set flavio.citations.__all_citations and then a new Citations instance has to be created. Something like this could also be done in the context manager (e.g. by passing the user-defined references to collect()). But I'm not sure if that's actually necessary or if the feature should only be meant for citations in the flavio source code.

In terms of some small improvements, should there be a nicer error message if you try and register a paper that isn't in citations.yml? Just like a reminder that new citations need to be added or the script needs to be re-run, whereas right now you just get KeyError.

Yes, good point. I've just added this to the register method.

Apart from this, I think we still need some docstrings and some tests.

@MJKirk
Contributor Author

MJKirk commented Feb 22, 2021

But I'm not sure if that's actually necessary or if the feature should only be meant for citations in the flavio source code.

Yeah, I'm of the opinion that that is all we need to support, so it's fine as is.

Yes, good point. I've just added this to the register method.

👍

Docstrings and some more tests I can work on in the next day or two

@peterstangl
Collaborator

The Travis CI checks currently fail due to a problem with the new PyPI version of wilson. I have reported this already in issue wilson-eft/wilson#67.

@MJKirk
Contributor Author

MJKirk commented Feb 24, 2021

Ah okay. Sadly GitHub doesn't seem to have an emoji for "phew, I'm glad I didn't manage to break everything with one simple commit" 😂

@peterstangl
Collaborator

@MJKirk I've added a small commit that I still had lying around and that slightly improves the regexp in the extract_citation function. And now that wilson has been fixed, all tests pass as well 😀

Is there still something you want to add? I think in principle, this PR looks quite good. @DavidMStraub what do you think?

@MJKirk
Contributor Author

MJKirk commented Mar 2, 2021

No, there's nothing else from me, I'm happy for you to merge it.

Maybe I can just open an issue to list the couple of observables where I wasn't sure, for future reference rather than them being buried in the middle of this long PR? (It was ee->WW, neutrino trident, and perhaps B decay generally).

@peterstangl peterstangl merged commit 16fc707 into flav-io:master Mar 6, 2021
@peterstangl
Collaborator

No, there's nothing else from me, I'm happy for you to merge it.

OK, done!

Maybe I can just open an issue to list the couple of observables where I wasn't sure, for future reference rather than them being buried in the middle of this long PR? (It was ee->WW, neutrino trident, and perhaps B decay generally).

Yes, I think it's a good idea to open a new issue for this!

@MJKirk MJKirk deleted the theory-citation-function branch March 8, 2021 12:25
cmarinbe pushed a commit to cmarinbe/flavio that referenced this pull request Oct 20, 2021
* Copy over PyBaMM citations code

Also add license notice

* Minimal working version of citations function

By default, just cite the flavio paper.
Removed the read_citation method, as rather than reading the bibtex
from a file, it seems better to just output the INSPIRE texkeys and
worry about getting bibtex from INSPIRE later.
(E.g. if you print to a .tex file, you could use INSPIRE's bibliography
generator at https://inspirehep.net/bibliography-generator, or, if fixed,
David's inspiretools python package.)

* An initial citation reference for testing

The way flavio does Gamma_12 is based on the a,b,c notation originally
introduced in the 2003 BBLN paper

* Add SM_citations function to Observable class

Gives the theory papers to be cited for an observable, using a throwaway
instance.
As part of this, move the registering of the flavio paper out of the
constructor so you don't get that as well.

* Merge _reset method into constructor

Since this only gets used here, just do it in the constructor.
(In the original PyBaMM version the _reset just gets used for testing,
which I think we want to do slightly differently anyway.)

* Add docstring to SM_citations

* Copy over the PyBaMM citation test class

* Revert "Merge _reset method into constructor"

This reverts commit a6a545e.
Yeah, we need the reset method for testing...

* Rejig the print functions to return a list

Return something instead of printing directly, unless outputting to a
file.

* Add some tests of the citation functionality

* Fix typo in docstring

* Citations for B->Xgamma

Note the CP asymmetry needs citations adding, wherever the numbers in
ka1_r, ka2_r, ka1_i came from.

* Citations for meson mixing

Just the Inami-Lim function - everything else worth citing comes in
through parameters (fB, B, RG factors)

* Add citations for W and Z observables

* Improve theory citation function

Allow for arguments to be passed to prediction function.
Also change the name as you are really getting the citations for a
general theory prediction, not just the SM.

* Add citations for beta decay

* Fix failing test

Forgot to update the test when I renamed the function

* Citations for mu and tau decays

* Citations for quark mass conversions

* Citations for K decays

* Add D decay form factor citations

* Add a bunch of b decay citations

* Add bvll citations

* Add B -> P formfactor citations

* Add B -> gamma formfactor citations

* Add citations for B->V formfactors

* Small change to docstrings

* Add citations for Higgs stuff

I'm assuming all the numbers come (indirectly) from 1911.07866 as
mentioned in the release notes for flavio v2

* Adds citations for B->llgamma

* Fix indentation

Got borked when I rebased

* Move copyright notice to appropriate file

* Give citations instance a unique name

* Correct docstring

* Fix code typo

Somehow this find+replace got mucked up

* Improve internals of the citation class

Update tests and the theory_citation function too

* Remove tex citation string method, add properties

* Add convenience `register_citation` method

* use multiprocessing.Array

* raise meaningful error message if inspire key not in YAML

* Add docstrings for the main methods

* Update and add tests

Specifically test that a multithreaded computation works and gives the
same citations as single threaded.

* Update tests

Remove the test for a "unknown" inspire key, since it becomes a "known"
inspirekey as soon as we re-run update_citations.py.

Add a test to check that the list of "known" citations matches what's in
the source code.

* improve regexp in `extract_citations` function

Co-authored-by: Peter Stangl <peter.stangl@ph.tum.de>