negative binomial tom for classical area/point sources #7870
negative binomial tom for classical area/point sources #7870micheles merged 16 commits intogem:masterfrom
Conversation
c88f7b5 to
5fd3019
Compare
|
This is the most advanced contribution (in terms of impact on the engine core) we ever received in 10 years. You clearly did an impressive amount of work. However, when working at this level of sophistication, the devil is in the details: the new feature causes a significant slowdown for code not using the negbin distribution. For instance do the following in current master: In your branch the performance is terrible since all the time in spent in saving the ruptures: My suggestion would be to split this PR in many small PRs adding various pieces, checking at every moment that you do not lose performance in calculations not using the feature. When you find a PR with a performance hit, please ask here for help. |
|
Michele, thank you for your reply! I was finger crossing that I would not cause a performance issue, so thank you for pinpointing what could be causing that. I will take a look step by step what is causing the performance break, and when I have identified it, we could discuss an elegant solution for it. |
|
Pablo, I gave a look at the PR and did not manage to find at first sight a component of the code where what you are adding might impact the other sources. I will give a more thorough look. Note moreover that we have a test broken that we need to figure out why it's not passing. |
|
I fixed what was slowing the overall process. Apparently I was unfolding the context rates everytime I wanted to check in concat() whether if a ctx was Poisson, Nonparam or Negative Binomial. But, to check the failing tests: Could you point me to a doc or tell me how can I run them myself? |
|
Update your branch from the latest master, install pytest and then |
Negative Binomial implementation with uncorrelated sources. - Added NegativeBinomialTOM class in hazardlib.tom. Calculate probabilities of no-exceedance approximating the infinite series. - Modified sourcewriter.py to write a source group, where some of its pointsources could be NB. - Modified sourceconverter.py to read source models with a temporal_occurrence_model node in the definition of Point Sources. - Modified contexts.py to use the temporal model of a point source (which has the parameters for each NB point, instead of the MFD) - For rupture_contexts that have both occurrence_rate (mean_rate) and probs_occur (e.g. negbinomial or other parametric non-poisson distributions), bypassed the concateniation between context. It is due to numpy forcing for probs_occur to have the same shape. fixed typo fixed typo Added NB doc and set a n_max fixed for NB pdf (7), so contexts can handle the multiple_shape array of context.probs_occur Added NB tom test - Fixed NB writer to write properly a negbinom tom - Modified NegativeBinomial.get_pmf() to handle mean_rates of type int and np.ndarray - Modified contexts to handle properly a NB TOM, when multiple rupture planes and hypocentral depths are present. Negative Binomial implementation with uncorrelated sources. - Added NegativeBinomialTOM class in hazardlib.tom. Calculate probabilities of no-exceedance approximating the infinite series. - Modified sourcewriter.py to write a source group, where some of its pointsources could be NB. - Modified sourceconverter.py to read source models with a temporal_occurrence_model node in the definition of Point Sources. - Modified contexts.py to use the temporal model of a point source (which has the parameters for each NB point, instead of the MFD)
Fixed sourcewriter to ignore sourcegroup tom. - modified tom name in sourcewriter - Modified concat to not access full occurrence_rate array while checking for NB - Fixed NB qa_tests
Negative Binomial implementation with uncorrelated sources. - Added NegativeBinomialTOM class in hazardlib.tom. Calculate probabilities of no-exceedance approximating the infinite series. - Modified sourcewriter.py to write a source group, where some of its pointsources could be NB. - Modified sourceconverter.py to read source models with a temporal_occurrence_model node in the definition of Point Sources. - Modified contexts.py to use the temporal model of a point source (which has the parameters for each NB point, instead of the MFD) - For rupture_contexts that have both occurrence_rate (mean_rate) and probs_occur (e.g. negbinomial or other parametric non-poisson distributions), bypassed the concateniation between context. It is due to numpy forcing for probs_occur to have the same shape. fixed typo fixed typo Added NB doc and set a n_max fixed for NB pdf (7), so contexts can handle the multiple_shape array of context.probs_occur Added NB tom test - Fixed NB writer to write properly a negbinom tom - Modified NegativeBinomial.get_pmf() to handle mean_rates of type int and np.ndarray - Modified contexts to handle properly a NB TOM, when multiple rupture planes and hypocentral depths are present. Negative Binomial implementation with uncorrelated sources. - Added NegativeBinomialTOM class in hazardlib.tom. Calculate probabilities of no-exceedance approximating the infinite series. - Modified sourcewriter.py to write a source group, where some of its pointsources could be NB. - Modified sourceconverter.py to read source models with a temporal_occurrence_model node in the definition of Point Sources. - Modified contexts.py to use the temporal model of a point source (which has the parameters for each NB point, instead of the MFD)
Fixed sourcewriter to ignore sourcegroup tom. - modified tom name in sourcewriter - Modified concat to not access full occurrence_rate array while checking for NB - Fixed NB qa_tests
…ntext, and then concatenates by identical shape of probs_occur
…ative binomial case_78 test to classical_test
|
I've managed to run all the tests, find some details in my modifications that made them fail. |
|
case_78 is still red |
|
sorry, missed an init in that the case_78 for negbinom |
| if parametric_np: | ||
| for shp in set(ctx.probs_occur.shape[1] for ctx in parametric_np): | ||
| p_array = [p for p in parametric_np if p.probs_occur.shape[1] == shp] | ||
| out.append(numpy.concatenate(p_array).view(numpy.recarray)) |
There was a problem hiding this comment.
Could you add a comment explaining the logic here?
| :class:`openquake.hazardlib.mfd.TruncatedGRMFD` instance | ||
| """ | ||
|
|
||
| if node.tag.endswith('pointSource'): # Check if there a tom specified at the point level |
There was a problem hiding this comment.
This is not clear to me: what happens if you have a non-pointsource? How do you specify the TOM? Also, if you have 1 million sources specifying the TOM for each one makes little sense. My understanding is the TOM should be set at the SourceGroup level, not at the source level, i.e. there are multiple sources with the same TOM. Let me check with @mmpagani and we will implement that feature, including a general way of passing extra parameters to the TOM subclass.
There was a problem hiding this comment.
It turns out it is already implemented, see this example: https://github.com/gem/oq-engine/blob/master/openquake/qa_tests_data/classical/case_35/source_model.xml
The place to touch is the method get_tom in the SourceConverter.
openquake/hazardlib/tom.py
Outdated
| """ | ||
| Negative Binomial temporal occurrence model. | ||
| """ | ||
| def __init__(self, time_span, occurrence_rate=None, parameters=None): |
There was a problem hiding this comment.
I don't like to use a generic parameters. Just use mu and alpha, as mandatory parameters. Also the occurrence_rate is not needed, see #7931
…ametric and Parametric Non-Poisson
# Conflicts: # openquake/hazardlib/sourceconverter.py
…hanged tom to become an attribute of a Source node, in consistency to nonparametric or cluster, rather than an making it an extra subnode. Modified NegativeBinomialTOM class to explicitly give mu and alpha as param.
| # if tom is negbinom, sets mu and alpha attr to tom_class | ||
| if node['tom'] == 'NegativeBinomialTOM': | ||
| kwargs = {'alpha': eval(node['alpha']), | ||
| 'mu': eval(node['mu'])} |
There was a problem hiding this comment.
Please use ast.literal_eval here, which is safe
| name="point00000" | ||
| tom="NegativeBinomialTOM" | ||
| mu="0.1" | ||
| alpha="2.0" |
There was a problem hiding this comment.
tom, mu and alpha should go inside the sourceGroup, not inside pointSource.
| if isinstance(tom, NegativeBinomialTOM): | ||
| attrs['tom'] = 'NegativeBinomialTOM' | ||
| attrs['mu'] = tom.mu | ||
| attrs['alpha'] = tom.alpha |
There was a problem hiding this comment.
Not the right place for this, since the tom instance should not be an attribute of PointSource, only of SourceGroup.
There was a problem hiding this comment.
After discussion, it became clear that for the New Zealand model they really need by source TOMs, so I will retract my objection. Still the sourcewriter will not work for non-point-sources with NegativeBinomialTOM.
|
LGTM |
|
LGTM @pabloitu further checks to be performed and not covered by this PR:
Both are important since the former is needed for risk analyses while disaggregation provides important information for engineering applications. |
(Draft) Created new tom implementation for uncorrelated sources, with mu/alpha parametrization of NB (Kagan, 2010)). Modified sourceconverter/writer to be able to handle NB-Tom XML-Nodes from source_model files. And modified context manager to handle a parametric_source with multiple rupture occurrence.