negative binomial tom for classical area/point sources by pabloitu · Pull Request #7870 · gem/oq-engine

pabloitu · 2022-05-27T21:14:34Z

(Draft) Created a new TOM implementation for uncorrelated sources, with the mu/alpha parametrization of the negative binomial (NB) distribution (Kagan, 2010). Modified sourceconverter/writer to handle NB TOM XML nodes from source_model files, and modified the context manager to handle a parametric source with multiple rupture occurrences.
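For readers unfamiliar with the mu/alpha form, the sketch below evaluates a negative binomial PMF in pure Python. It assumes the common mean/dispersion convention, in which the variance is mu + alpha * mu**2, so that tau = 1/alpha and theta = tau / (tau + mu). The helper name `negbin_pmf` is illustrative and is not taken from the PR.

```python
import math


def negbin_pmf(n, mu, alpha):
    """P(N = n) for a negative binomial with mean mu and dispersion alpha.

    Assumed convention (not confirmed by the PR text):
    variance = mu + alpha * mu**2.
    """
    tau = 1.0 / alpha           # shape parameter
    theta = tau / (tau + mu)    # success probability
    log_coeff = math.lgamma(n + tau) - math.lgamma(n + 1) - math.lgamma(tau)
    return math.exp(log_coeff + tau * math.log(theta) + n * math.log1p(-theta))


# Same mu and alpha as the case_78 point source quoted later in the review.
mu, alpha = 0.1, 2.0
pmf = [negbin_pmf(n, mu, alpha) for n in range(200)]
total = sum(pmf)
mean = sum(n * p for n, p in enumerate(pmf))
var = sum((n - mean) ** 2 * p for n, p in enumerate(pmf))
```

With alpha close to zero the variance approaches the mean, which is the Poisson limit; larger alpha gives more overdispersed counts.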

micheles · 2022-06-09T07:33:43Z

This is the most advanced contribution (in terms of impact on the engine core) we ever received in 10 years. You clearly did an impressive amount of work.

However, when working at this level of sophistication, the devil is in the details: the new feature causes a significant slowdown for code not using the negbin distribution.

For instance do the following in current master:

$ oq run demos/hazard/LogicTreeCase2ClassicalPSHA/job.ini && oq show performance
| calc_31715, maxmem=2.4 GB  | time_sec  | memory_mb | counts |
|----------------------------+-----------+-----------+--------|
| total classical            | 12.3      | 4.52344   | 27     |
| make_contexts              | 6.30520   | 0.0       | 14_388 |
| ClassicalCalculator.run    | 3.89814   | 22.2      | 1      |

In your branch the performance is terrible, since all the time is spent saving the ruptures:

$ oq run demos/hazard/LogicTreeCase2ClassicalPSHA/job.ini && oq show performance
| calc_31714, maxmem=2.3 GB  | time_sec  | memory_mb | counts |
|----------------------------+-----------+-----------+--------|
| ClassicalCalculator.run    | 46.6      | 23.4      | 1      |
| saving rup_data            | 42.8      | 0.25781   | 27     |
| total classical            | 16.4      | 4.46094   | 27     |

My suggestion would be to split this PR into many small PRs, each adding one piece, checking at every step that you do not lose performance in calculations that do not use the feature. When you find a PR with a performance hit, please ask here for help.

pabloitu · 2022-06-09T10:22:13Z

Michele, thank you for your reply! I had my fingers crossed that I would not cause a performance issue, so thank you for pinpointing what could be causing it. I will go through the changes step by step to find what is causing the slowdown, and once I have identified it, we can discuss an elegant solution.

mmpagani · 2022-06-09T15:18:12Z

Pablo, I had a look at the PR and could not find, at first sight, a part of the code where what you are adding might affect the other sources. I will take a more thorough look. Note also that we have a broken test, and we need to figure out why it is not passing.

pabloitu · 2022-06-14T00:05:47Z

I fixed what was slowing down the overall process. Apparently I was unfolding the context rates every time I wanted to check in concat() whether a ctx was Poisson, nonparametric or negative binomial. As for the failing tests: could you point me to a doc or tell me how I can run them myself?

micheles · 2022-06-14T06:25:19Z

Update your branch from the latest master, install pytest and then

$ pytest -vsx openquake/hazardlib/tests/sourcewriter_test.py

Negative Binomial implementation with uncorrelated sources.
- Added NegativeBinomialTOM class in hazardlib.tom. Calculate probabilities of no-exceedance approximating the infinite series.
- Modified sourcewriter.py to write a source group, where some of its pointsources could be NB.
- Modified sourceconverter.py to read source models with a temporal_occurrence_model node in the definition of Point Sources.
- Modified contexts.py to use the temporal model of a point source (which has the parameters for each NB point, instead of the MFD)
- For rupture_contexts that have both occurrence_rate (mean_rate) and probs_occur (e.g. negbinomial or other parametric non-poisson distributions), bypassed the concatenation between contexts. This is because numpy forces probs_occur to have the same shape.

fixed typo

Added NB doc and set a fixed n_max for the NB pdf (7), so contexts can handle the multiple_shape array of context.probs_occur

Added NB tom test
- Fixed NB writer to properly write a negbinom tom
- Modified NegativeBinomial.get_pmf() to handle mean_rates of type int and np.ndarray
- Modified contexts to properly handle a NB TOM when multiple rupture planes and hypocentral depths are present.
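The "approximating the infinite series" step can be illustrated as follows. If a source produces N occurrences in the time span and each occurrence independently exceeds a ground-motion level with probability poe, the probability of no exceedance is the sum over n of P(N = n) * (1 - poe)**n. The sketch below truncates that sum and compares it with the closed form obtained from the NB probability generating function. The function names, the truncation length and the mean/dispersion convention (variance = mu + alpha * mu**2) are assumptions for illustration, not the code in this PR.

```python
import math


def negbin_pmf(n, mu, alpha):
    """NB probability of n occurrences (assumed mean/dispersion form)."""
    tau = 1.0 / alpha
    theta = tau / (tau + mu)
    log_coeff = math.lgamma(n + tau) - math.lgamma(n + 1) - math.lgamma(tau)
    return math.exp(log_coeff + tau * math.log(theta) + n * math.log1p(-theta))


def prob_no_exceed_series(poe, mu, alpha, n_max=50):
    """Truncated series: sum_n P(N = n) * (1 - poe)**n for n < n_max."""
    return sum(negbin_pmf(n, mu, alpha) * (1.0 - poe) ** n
               for n in range(n_max))


def prob_no_exceed_closed(poe, mu, alpha):
    """Closed form from the NB generating function evaluated at 1 - poe."""
    return (1.0 + alpha * mu * poe) ** (-1.0 / alpha)


series = prob_no_exceed_series(0.3, mu=0.1, alpha=2.0)
closed = prob_no_exceed_closed(0.3, mu=0.1, alpha=2.0)
```

For small mu the terms decay quickly, so a short truncation is usually enough; for large mu or alpha the number of terms needed grows and should be checked against the closed form.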

Fixed sourcewriter to ignore sourcegroup tom. - modified tom name in sourcewriter - Modified concat to not access full occurrence_rate array while checking for NB - Fixed NB qa_tests


pabloitu · 2022-06-21T21:54:23Z

I've managed to run all the tests and found some details in my modifications that made them fail.

micheles · 2022-06-22T13:58:46Z

case_78 is still red

pabloitu · 2022-06-22T21:00:42Z

Sorry, I missed an __init__ in case_78 for negbinom.

micheles · 2022-06-23T06:56:43Z

openquake/hazardlib/contexts.py

+    if parametric_np:
+        for shp in set(ctx.probs_occur.shape[1] for ctx in parametric_np):
+            p_array = [p for p in parametric_np if p.probs_occur.shape[1] == shp]
+            out.append(numpy.concatenate(p_array).view(numpy.recarray))


Could you add a comment explaining the logic here?
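One way to read the snippet above: the commit message says numpy forces probs_occur to have the same shape when concatenating, so the contexts are first grouped by the second dimension of probs_occur and each group is concatenated separately. The sketch below reproduces that grouping on made-up data; `make_ctx` and its field layout are hypothetical stand-ins, not the engine's real context dtype.

```python
import numpy


def make_ctx(n_rups, n_probs):
    """Hypothetical stand-in for a rupture-context recarray."""
    dt = numpy.dtype([('mag', float), ('probs_occur', float, (n_probs,))])
    return numpy.zeros(n_rups, dt).view(numpy.recarray)


def concat_by_shape(ctxs):
    """Concatenate only the contexts whose probs_occur width matches."""
    out = []
    widths = sorted({ctx.probs_occur.shape[1] for ctx in ctxs})
    for width in widths:
        same = [ctx for ctx in ctxs if ctx.probs_occur.shape[1] == width]
        out.append(numpy.concatenate(same).view(numpy.recarray))
    return out


# Two contexts with 3 probabilities per rupture, one with 7.
ctxs = [make_ctx(2, 3), make_ctx(4, 7), make_ctx(1, 3)]
merged = concat_by_shape(ctxs)
shapes = [m.probs_occur.shape for m in merged]
```

The result is one array per distinct width rather than a single array, so downstream code has to loop over the groups.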

micheles · 2022-06-23T07:18:31Z

openquake/hazardlib/sourceconverter.py

                  :class:`openquake.hazardlib.mfd.TruncatedGRMFD` instance
        """
+
+        if node.tag.endswith('pointSource'):        # Check if there a tom specified at the point level


This is not clear to me: what happens if you have a non-pointsource? How do you specify the TOM? Also, if you have 1 million sources specifying the TOM for each one makes little sense. My understanding is the TOM should be set at the SourceGroup level, not at the source level, i.e. there are multiple sources with the same TOM. Let me check with @mmpagani and we will implement that feature, including a general way of passing extra parameters to the TOM subclass.

It turns out it is already implemented, see this example: https://github.com/gem/oq-engine/blob/master/openquake/qa_tests_data/classical/case_35/source_model.xml
The place to touch is the method get_tom in the SourceConverter.

micheles · 2022-06-23T07:40:33Z

openquake/hazardlib/tom.py

+    """
+    Negative Binomial temporal occurrence model.
+    """
+    def __init__(self, time_span, occurrence_rate=None, parameters=None):


I don't like using a generic parameters argument. Just use mu and alpha as mandatory parameters. Also, occurrence_rate is not needed; see #7931
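A minimal sketch of the signature this comment asks for, with mu and alpha as explicit mandatory arguments instead of a generic dictionary. The class name `NegBinTOMSketch` and the validation rules are assumptions for illustration; the merged NegativeBinomialTOM class may differ.

```python
class NegBinTOMSketch:
    """Illustrative only: explicit mu/alpha instead of a parameters dict."""

    def __init__(self, time_span, mu, alpha):
        # Assumed constraints: positive time span, mean rate and dispersion.
        if time_span <= 0:
            raise ValueError('time_span must be positive')
        if mu <= 0 or alpha <= 0:
            raise ValueError('mu and alpha must be positive')
        self.time_span = time_span
        self.mu = mu
        self.alpha = alpha


tom = NegBinTOMSketch(50.0, mu=0.1, alpha=2.0)

# Omitting a mandatory parameter fails at construction time.
try:
    NegBinTOMSketch(50.0, mu=0.1)
except TypeError:
    missing_alpha_rejected = True
else:
    missing_alpha_rejected = False
```

Making the parameters explicit means a missing or misspelled one is caught when the object is built rather than later, when the distribution is evaluated.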


…hanged tom to become an attribute of a Source node, consistent with nonparametric or cluster, rather than making it an extra subnode. Modified NegativeBinomialTOM class to take mu and alpha explicitly as parameters.

micheles · 2022-06-24T07:46:06Z

openquake/hazardlib/sourceconverter.py

+            # if tom is negbinom, sets mu and alpha attr to tom_class
+            if node['tom'] == 'NegativeBinomialTOM':
+                kwargs = {'alpha': eval(node['alpha']),
+                          'mu': eval(node['mu'])}


Please use ast.literal_eval here, which is safe
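To illustrate the difference: `ast.literal_eval` only accepts Python literals, so an attribute value containing arbitrary code is rejected instead of executed. Since mu and alpha are plain numbers, `float` is an equally safe alternative (the later commit "changed eval use by use of float" went that way). The attribute dictionary below is a made-up stand-in for the XML node.

```python
import ast

# Made-up stand-in for the attributes of a source node.
attrs = {'mu': '0.1', 'alpha': '2.0'}

# literal_eval parses numeric literals without executing code.
kwargs = {key: ast.literal_eval(val) for key, val in attrs.items()}

# An expression that eval() would execute is refused by literal_eval.
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True
else:
    rejected = False

# For plain numbers, float() gives the same values.
kwargs_float = {key: float(val) for key, val in attrs.items()}
```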

micheles · 2022-06-24T08:45:52Z

openquake/qa_tests_data/classical/case_78/source_negbinom.xml

+            name="point00000"
+            tom="NegativeBinomialTOM"
+            mu="0.1"
+            alpha="2.0"


tom, mu and alpha should go inside the sourceGroup, not inside pointSource.

micheles · 2022-06-24T08:47:08Z

openquake/hazardlib/sourcewriter.py

+        if isinstance(tom, NegativeBinomialTOM):
+            attrs['tom'] = 'NegativeBinomialTOM'
+            attrs['mu'] = tom.mu
+            attrs['alpha'] = tom.alpha


Not the right place for this, since the tom instance should not be an attribute of PointSource, only of SourceGroup.

After discussion, it became clear that for the New Zealand model they really need per-source TOMs, so I retract my objection. Still, the sourcewriter will not work for non-point sources with NegativeBinomialTOM.
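The per-source arrangement discussed here could be read roughly as in the sketch below, which takes tom, mu and alpha from a pointSource element and falls back to the enclosing group when the source does not set them. The XML is a simplified, namespace-free fragment invented for illustration, and both the group-level `tom` attribute and the fallback rule are assumptions, not the behaviour of SourceConverter.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment (real NRML files use a namespace).
XML = """
<sourceGroup name="grp" tom="PoissonTOM">
  <pointSource id="1" name="point00000"
               tom="NegativeBinomialTOM" mu="0.1" alpha="2.0"/>
  <pointSource id="2" name="point00001"/>
</sourceGroup>
"""


def tom_spec(src, group):
    """Return (tom_name, kwargs), preferring source-level attributes."""
    name = src.get('tom', group.get('tom', 'PoissonTOM'))
    if name == 'NegativeBinomialTOM':
        mu = src.get('mu', group.get('mu'))
        alpha = src.get('alpha', group.get('alpha'))
        return name, {'mu': float(mu), 'alpha': float(alpha)}
    return name, {}


group = ET.fromstring(XML)
specs = {src.get('id'): tom_spec(src, group)
         for src in group.iter('pointSource')}
```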

micheles · 2022-06-29T08:28:05Z

LGTM

mmpagani · 2022-06-29T08:40:04Z

LGTM

@pabloitu further checks to be performed and not covered by this PR:

- event-based calculation
- disaggregation calculation

Both are important since the former is needed for risk analyses while disaggregation provides important information for engineering applications.

micheles added this to the Engine 3.15.0 milestone May 31, 2022

micheles added the enhancement label May 31, 2022

pabloitu force-pushed the negbinom_3.15 branch 3 times, most recently from c88f7b5 to 5fd3019 Compare June 3, 2022 23:10

pabloitu force-pushed the negbinom_3.15 branch from 6f77cac to d65b3ad Compare June 13, 2022 22:22

pabloitu force-pushed the negbinom_3.15 branch from d65b3ad to 26854f9 Compare June 15, 2022 19:53

pabloitu added 2 commits June 17, 2022 15:12

Simplified dimensions handling of ctx.probs_occur

fb72d70

Fixed sourcewriter to ignore sourcegroup tom. - modified tom name in sourcewriter - Modified concat to not access full occurrence_rate array while checking for NB - Fixed NB qa_tests

pabloitu force-pushed the negbinom_3.15 branch from 26854f9 to fb72d70 Compare June 17, 2022 13:15

pabloitu added 7 commits June 21, 2022 13:24

Simplified dimensions handling of ctx.probs_occur

5f75b68

Fixed sourcewriter to ignore sourcegroup tom. - modified tom name in sourcewriter - Modified concat to not access full occurrence_rate array while checking for NB - Fixed NB qa_tests

Merge remote-tracking branch 'origin/negbinom_3.15' into negbinom_3.15

9e3e490

Detecting in concat if both occurrence_rate and probs_occurs is in co…

fa73c57

…ntext, and then concatenates by identical shape of probs_occur

Merge remote-tracking branch 'upstream/master' into negbinom_3.15

d22fbbc

add temporal model node only if is negbinom

f026605

Added initialization shape for ctx.probs_occur in recarray. Added neg…

8c03691

…ative binomial case_78 test to classical_test

added __init__ in for negative binomial qa_test

90c55f1

micheles reviewed Jun 23, 2022


micheles mentioned this pull request Jun 23, 2022

Serialize parametric TOM instances #7931

Merged

pabloitu added 5 commits June 23, 2022 13:49

added comments for concat function that now captures Poisson, non-par…

bb9d5b4

…ametric and Parametric Non-Poisson

Merge remote-tracking branch 'upstream/master' into negbinom_3.15

5d23bf7

# Conflicts: # openquake/hazardlib/sourceconverter.py

Merge remote-tracking branch 'upstream/master' into negbinom_3.15

01bde0d

Modified tests

a485ace

micheles reviewed Jun 24, 2022


changed eval use by use of float

86efed2

micheles merged commit 9355e51 into gem:master Jun 29, 2022

micheles mentioned this pull request Jun 29, 2022

The SourceWriter will not work for non-PointSources with a nontrivial TOM #7946

Open
