This PR adds PIG 7: a proposal and work plan for improving Gammapy's current model framework and bringing it into a state acceptable for the Gammapy v1.0 release.
I plan to open a GitHub project for this PIG with associated issues for the list of proposed PRs at the end of the PIG.
cdeil left a comment
@adonath - Thank you for the writeup!
I think it will be very useful to find good solutions and then to implement this step by step in early 2019.
Clarifying the following points is very important; something should be said about them in the PIG:
Probably to be discussed a bit later:
Smaller comments inline.
@adonath - I started a notebook today to look a bit at how linked parameters work in other modeling frameworks. It's at https://github.com/gammapy/gammapy-extra/blob/master/experiments/parameter_links.ipynb .
I didn't get very far, and don't have time in the coming weeks, so feel free to continue there or reference it from the PIG if you think it's useful.
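To make the parameter-linking idea concrete, here is a minimal sketch of one common approach used by modeling frameworks: two models hold a reference to the same parameter object, so updating it in one place is seen by both. The `Parameter` and `Model` classes below are hypothetical stand-ins for illustration, not Gammapy's actual API.

```python
class Parameter:
    """Minimal stand-in for a fit parameter with a name and a value."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class Model:
    """Minimal stand-in for a model holding named parameters."""

    def __init__(self, parameters):
        self.parameters = {p.name: p for p in parameters}


# Two models share the same "index" Parameter instance -> linked.
index = Parameter("index", 2.3)
model_a = Model([index, Parameter("amplitude", 1e-11)])
model_b = Model([index, Parameter("amplitude", 3e-12)])

# Changing the linked parameter via one model affects the other,
# while the unshared "amplitude" parameters stay independent.
model_a.parameters["index"].value = 2.0
assert model_b.parameters["index"].value == 2.0
assert model_a.parameters["amplitude"].value != model_b.parameters["amplitude"].value
```

The design choice here (shared object identity) is the simplest linking mechanism; other frameworks instead use a `tied`-style callable or an expression that computes one parameter from another.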
The main question IMO is how to set up models and parameters for multiple datasets, and what concrete changes and additions are needed for modeling in v0.10 or v0.11. If you think that can be done by referring to the same
I'm not very happy with the fact that we can't pass existing
But given that (at least for me) there's no clear "best" modeling API / framework, I think anything that gets the main use cases done is fine. We should get datasets in early 2019, and then we can always refactor and improve the modeling framework later in the year. So even just using the existing post-init access internals way would be fine with me, if it allows getting datasets e.g. for v0.11.
I know that we've discussed this before, but I can't find any mention of parameter linking or of distributed / parallel computing above!?
Anyway, my main point is that we should try to develop modeling & fitting in Gammapy to support
It doesn't have to be in Gammapy v1.0 in fall, but it should be there in the coming years.
Motivation for parallel computing: currently Gammapy runs in a single CPU process, but most users have access to multi-core CPUs or CPU clusters, which, if used properly, would give 10x to 1000x speedups. This is often "nice to have" to run an analysis in 1 minute instead of 10 minutes or 1 hour, but there are use cases where great speedups are required - e.g. a Galactic center analysis with 10 overlapping sources and 100s of observations.
Motivation for gradients: less strong than the need for parallel computing. For fitting up to ~ 10 sources we could probably get all results with our current optimisers (which often do numeric gradients by stepping in each parameter direction), but overall gradient-based optimisers are often more reliable (more accurate gradients) and more efficient, especially with many parameters (say ~ 100 parameters from the ~ 10 sources).
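The "stepping in each direction" that these optimisers do can be sketched as central finite differences: two function evaluations per parameter, which is why the cost grows with the number of parameters. This is a generic illustration of the technique, not Gammapy's or any specific optimiser's implementation.

```python
def numeric_gradient(f, x, eps=1e-6):
    """Approximate the gradient of f at x by central finite differences.

    Requires 2 evaluations of f per parameter, so with ~100 parameters
    each gradient costs ~200 likelihood evaluations.
    """
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


# Example: f(x, y) = x**2 + 3*y has gradient (2x, 3), i.e. (4, 3) at (2, 1).
g = numeric_gradient(lambda p: p[0] ** 2 + 3 * p[1], [2.0, 1.0])
```

An analytic or autodiff gradient avoids both the extra evaluations and the truncation error of the `eps` step, which is the efficiency/reliability argument above.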
@adonath - what do you think? what is feasible for v1.0 (fall 2019), and what should be listed as possible directions for later (e.g. v2.0 in 2020 or even later)?
Since this PIG currently doesn't mention "uncertainty", "covariance", "samples", "MCMC" or "Bayesian", I've split out that part into its own PIG: #2255
@adonath - Thanks for closing this out, and all your work on modeling in the last year.
Please add a link to this PIG from
I think we should still put a closing date. Maybe like this?
instead of the empty