[Proposal] Hazard schema - Hazard occurrence probability / frequency #59

Closed
matamadio opened this issue May 16, 2023 · 20 comments · Fixed by #121
Labels: hazard (Issues related to Hazard data), proposal (New feature or request)

Comments

@matamadio
Contributor

matamadio commented May 16, 2023

What is the context or reason for the change?

From the original hazard schema report (pg 15-17):

- event.set.time attributes refer to the whole collection of scenarios modelled;
- event occurrence attributes refer to the period simulated by individual scenarios.

| Name | Description | Type |
| --- | --- | --- |
| frequency | The frequency of occurrence of the present event (for the reference period see occurrence_time_span or occurrence_time_start and occurrence_time_end). | float |
| occurrence_probability | The probability of occurrence in a given time interval defined either through the occurrence_time_start and occurrence_time_end or through the occurrence_time_span parameter. | float |
| occurrence_time_start | The start date (and possibly time) of the time period used to specify either the frequency or the occurrence_probability | ISO 8601 |
| occurrence_time_end | The end date (and possibly time) of the time period used to specify either the frequency or the occurrence_probability | ISO 8601 |
| occurrence_time_span | The duration of the period used to specify either the frequency or the occurrence_probability | float |

This framing of hazard occurrence (probability or frequency) tries to be as general as possible, but ends up being neither practical nor intuitive: expressing the most common occurrence measure (return period) would require entering time-duration fields with exact dates, which do not strictly apply to probabilistic distributions.

The documentation is not entirely clear on this: the description indicates using all three fields to relate to the '1 versus 50y issue', but the examples in the report show the fields being used to record the timing of the event, with no relation to probability.
If these refer to the duration of the event, then occurrence_time_span may be used to refer to the duration of the event (in days, hours, etc.) OR to the time period used for the frequency (1 year for annual frequency, or 50 years for % in 50 years, as described on the call). If this is meant to hold both cases, it is very confusing.

Changes have been proposed in the following version:

  • frequency_type was added to differentiate between RP, ER, EP, Susceptibility
  • return_period was added as a friendlier option to express the most common frequency type, instead of using occurrence_probability, which would require filling in 3 attributes for each return period.
  • occurrence_probability renamed as occurrence and kept to express frequency for ExRate and ExProb

However, this version wasn't convincing either:

  • there is no theoretical difference between ExRate and ExProb - why differentiate?
  • Why do we have 3 different fields (RP, ExProb and ExRate) when each one can be calculated from the others?
    RP= 1/50
    ExProb= 0.02
    ExRate= ...
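
The relationship between these quantities can be sketched as follows. This is only an illustration, not part of the proposal: it assumes event arrivals follow a Poisson process (an assumption not stated above), and the function names are hypothetical.

```python
import math


def rate_from_return_period(return_period_years: float) -> float:
    """Average number of events per year for a given return period."""
    return 1.0 / return_period_years


def probability_in_span(rate: float, span_years: float = 1.0) -> float:
    """Probability of at least one event within span_years.

    Assumes Poisson arrivals: P = 1 - exp(-rate * span_years).
    """
    return 1.0 - math.exp(-rate * span_years)


rate = rate_from_return_period(50)              # 0.02 events per year
annual_probability = probability_in_span(rate)  # slightly below 0.02
print(rate, round(annual_probability, 4))
```

For rare events the annual rate and the annual probability are nearly equal, which may be why the two are often used interchangeably.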
    

What is your proposed change?

if event_set.analysis_type = probabilistic

| Name | Description | Type |
| --- | --- | --- |
| event_set.occurrence_range | Which occurrence scenarios are included in the event_set (summary of all individual event.occurrence scenarios). | string |
| event.occurrence | The frequency for the hazard scenario (event). Can be Return Period and/or probability. Both fields are provided to allow entry of return period (most common term used across hazards) and/or probability, which is commonly used for seismic hazard. Probability commonly refers to a probability within 1 year or 50 years, but may be relative to any duration - when probability is used, span must be specified. | object |
| event.occurrence.return_period | Event return period (or recurrence interval); the estimated average time between events. Expressed as "1/10" for a 1-in-10 year return period, etc. | string |
| event.occurrence.probability | Event occurrence probability expressed as a float - for example 0.1 = 10% probability. Requires event.occurrence.span to define the period in which the probability applies. | float |
| event.occurrence.span | The duration (in years) used to specify the probability. | integer |

if event_set.analysis_type = empirical

| Name | Description | Type |
| --- | --- | --- |
| event.empirical.occurrence.time_start | For empirical events only. Provides the event start date and time. | ISO 8601 |
| event.empirical.occurrence.time_end | For empirical events only. Provides the event end date and time. This may be identical to time_start, or for long-duration events may be a date and time occurring after time_start. | ISO 8601 |

if event_set.analysis_type = deterministic

| Name | Description | Type |
| --- | --- | --- |
| event.deterministic.classes | Gives the thresholds classification or index values used by the deterministic event set, index approaches and susceptibility data. | string |

Can you provide an example?

Example 1: Probabilistic flood hazard dataset (event_set) including 3 scenarios (event): RP 20, RP 100, RP 250 years.

| Name | Sample value |
| --- | --- |
| event_set.occurrence_range | 20, 100, 250 |
| event.occurrence | 5% |
| event.occurrence.return_period | 1/20 |

| Name | Sample value |
| --- | --- |
| event_set.occurrence_range | 20, 100, 250 |
| event.occurrence | 1% |
| event.occurrence.return_period | 1/100 |

| Name | Sample value |
| --- | --- |
| event_set.occurrence_range | 20, 100, 250 |
| event.occurrence | 0.4% |
| event.occurrence.return_period | 1/250 |

Example 2: Probabilistic earthquake hazard dataset (event_set) including 3 scenarios (event):

  • 0.1 in 50 years.
  • ...

| Name | Sample value |
| --- | --- |
| event_set.occurrence_range | |
| event.occurrence | 0.2% |
| event.occurrence.probability | 0.1 |
| event.occurrence.span | 50 years |
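
As a rough cross-check of Example 2, a probability within a span can be converted to an equivalent return period. The sketch below assumes Poisson arrivals, which the proposal does not specify; the function name is hypothetical.

```python
import math


def return_period_from_probability(probability: float, span_years: float) -> float:
    """Return period (years) equivalent to a probability within span_years.

    Assumes Poisson arrivals: probability = 1 - exp(-span_years / return_period).
    """
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    annual_rate = -math.log(1.0 - probability) / span_years
    return 1.0 / annual_rate


# 0.1 probability within a 50-year span
return_period = return_period_from_probability(0.1, 50)
print(round(return_period))  # about 475 years
```

Under this assumption the annual rate is roughly 0.0021, in line with the 0.2% shown for event.occurrence in the example.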

Example 3: Empirical earthquake hazard dataset (event_set) including 1 event: L'Aquila earthquake

| Name | Sample value |
| --- | --- |
| event.empirical.occurrence.time_start | 2009-04-06 01:32:39 UTC |
| event.empirical.occurrence.time_end | 2009-04-06 01:45:25 UTC |

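
The sample values above are written as "YYYY-MM-DD hh:mm:ss UTC". A minimal sketch of producing ISO 8601 strings for the same instants with Python's standard library, assuming the times are in UTC as stated (variable names are illustrative):

```python
from datetime import datetime, timezone

time_start = datetime(2009, 4, 6, 1, 32, 39, tzinfo=timezone.utc)
time_end = datetime(2009, 4, 6, 1, 45, 25, tzinfo=timezone.utc)

# isoformat() yields an ISO 8601 string with an explicit UTC offset
start_iso = time_start.isoformat()
end_iso = time_end.isoformat()
print(start_iso)  # 2009-04-06T01:32:39+00:00
print(end_iso)    # 2009-04-06T01:45:25+00:00

# time_end may equal time_start but must not precede it
assert time_end >= time_start
```
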
@matamadio matamadio added the proposal New feature or request label May 16, 2023
@matamadio
Contributor Author

@stufraser1 please review examples (especially EQ)

@odscjen
Contributor

odscjen commented May 23, 2023

If I'm understanding the proposal correctly then someone could choose to give all probabilistic values e.g.

| Name | Sample value |
| --- | --- |
| event_set.occurrence_range | ? |
| event.occurrence.return_period | 1/500 |
| event.occurrence.probability | 0.1 |
| event.occurrence.span | 50 |

But event_set.occurrence_range is described as:

> Which occurrence scenarios are included in the event_set (summary of all individual event.occurrence scenarios).

but this doesn't specify what value is to be given as the summary value for each event. In Example 1 you've used span, and in Example 2 you've not included any values for occurrence_range, but based on the proposed descriptions they could be provided. Theoretically a user could provide a mix of RPs and % probabilities across the events in the event_set (I can't imagine why they would, but you never know!), so we should probably pick a single value to be used as the summary value and specify this.

Examples 1 and 2 have given event.occurrence a value. It should not have a value assigned to it because it is an object; the values should all be assigned to its constituent fields. So the correct examples would be:

Example 1 (a)

| Name | Sample value |
| --- | --- |
| event_set.occurrence_range | 20, 100, 250 |
| event.occurrence.return_period | 1/20 |

Putting it into JSON makes this a little clearer:

{
    "hazard": {
        "event_set": {
            "occurrence_range": "20, 100, 250"
        },
        "event": [
            {
                "occurrence": {
                    "return_period": "1/20"
                }
            }
        ]
    }
}

Example 2

You've already stated in the description that the span is in years and that this field is an integer, so it shouldn't include "years" in the example.

| Name | Sample value |
| --- | --- |
| event_set.occurrence_range | ?? |
| event.occurrence.probability | 0.1 |
| event.occurrence.span | 50 |

Is there a reason you've used percentages for the occurrence but specified a float for probability? I find the % more intuitive, but appreciate that a float might be the value users are more familiar with.

As all 3 potential scenarios are still being called part of occurrence I'd recommend we make all fields part of the occurrence object, e.g.

| Name | Description | Type |
| --- | --- | --- |
| event_set.occurrence_range | A summary of the probabilistic occurrence scenarios included in the event_set. This field must only be used if event_set.analysis_type = "probabilistic" | string |
| event.occurrence | The occurrence of the event. | object |
| event.occurrence.probabilistic | The frequency for the hazard scenario (event). Can be Return Period and/or probability. Both fields are provided to allow entry of return period (most common term used across all hazards) and/or probability, which is commonly used for seismic hazard. Probability commonly refers to a probability within 1 year or 50 years, but may be relative to any duration - when probability is used, span must be specified. | object |
| event.occurrence.probabilistic.return_period | The event return period (or recurrence interval); the estimated average time between events. Expressed as "1/10" for a 1-in-10 year return period, etc. | string |
| event.occurrence.probabilistic.probability | The occurrence probability of the modelled event. | object |
| event.occurrence.probabilistic.probability.value | The value of the occurrence probability expressed as a float - for example 0.1 = 10% probability. Requires event.occurrence.probabilistic.probability.span to define the period in which the probability applies. | float |
| event.occurrence.probabilistic.probability.span | The duration (in years) used to specify the probability. | integer |
| event.occurrence.empirical | The time period over which the observed event took place. | object |
| event.occurrence.empirical.time_start | The event start date and time. | ISO 8601 |
| event.occurrence.empirical.time_end | The event end date and time. This may be identical to time_start, or may be a date and time occurring after time_start | ISO 8601 |
| event.occurrence.deterministic | The thresholds classification or index values used by the deterministic event set, index approaches and susceptibility data. | string |

The main changes I'm suggesting are:

  • have occurrence.deterministic and occurrence.empirical rather than deterministic.occurrence and empirical.occurrence
  • create an occurrence.probabilistic object to hold the fields that are only for probabilistic scenarios (similar to the above point). By having these 3 clearly named objects it should be clear that each is separate and only applicable to the appropriate scenario.
  • create a probability object to hold the span and probability value fields as this way we can make both required within the object. We can leave the object itself as not required to allow for users only providing return_period, but if they do want to provide the probability as a float value then this will reinforce that span is also needed.
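
The effect of making value and span required inside the probability object can be sketched as a small check. This is an illustration only: the function name and the dict layout are assumptions, not an agreed schema.

```python
def check_probabilistic(probabilistic: dict) -> list:
    """Return a list of problems found in an occurrence.probabilistic object.

    The probability object itself is optional, but when it is present
    both 'value' and 'span' must be supplied.
    """
    problems = []
    probability = probabilistic.get("probability")
    if probability is not None:
        for required in ("value", "span"):
            if required not in probability:
                problems.append(f"probability.{required} is required")
    return problems


# Only a return period: acceptable
print(check_probabilistic({"return_period": "1/20"}))
# Probability without span: flagged
print(check_probabilistic({"probability": {"value": 0.1}}))
```

In a JSON Schema this would presumably be expressed with the `required` keyword on the probability object rather than with hand-written code.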

I'm also unclear as to what the values for deterministic represent and how they would be used. @matamadio Could you provide an example for that case?

@duncandewhurst thoughts?

@johcarter
Collaborator

I'm new to this discussion but wanted to give the cat modelling perspective on terminology with respect to stochastic event frequency.

We would call the reciprocal of the return period not an occurrence probability but an 'event rate' or 'annual arrival rate'. This is the average number of occurrences of the event within a given year, not a probability of occurrence.

In order to work out the probability of 0, 1, 2, ... occurrences within a given year, we would refer to a frequency distribution, such as Poisson. From the assumed distribution and parameter (the event rate can be the lambda used to parameterise the Poisson, for example), we could calculate the probability of no occurrences, 1 occurrence, 2 occurrences of that event within a year, etc.
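
A minimal sketch of that calculation, using the Poisson distribution with the event rate as lambda. The function name and the 100-year return period are illustrative assumptions.

```python
import math


def poisson_probability(k: int, event_rate: float) -> float:
    """Probability of exactly k occurrences in one year, given the annual event rate."""
    return math.exp(-event_rate) * event_rate ** k / math.factorial(k)


event_rate = 1 / 100  # reciprocal of a 100-year return period
for k in range(3):
    # probability of 0, 1 and 2 occurrences within a year
    print(k, poisson_probability(k, event_rate))
```
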

Something else that is important in event frequency distributions is seasonality and clustering of multiple events in time, which the return period / event rate info does not capture. One of my suggestions for capturing this in the upcoming ODS/RDL alignment project I am working on with Stu and co, will be an extra resource file which is a list of event occurrences across a span of years. This captures the seasonality and clustering aspect of event frequency within each year. Also, stochastic event catalogues in cat models are too large to be listed in meta-data.

@duncandewhurst
Contributor

I'll defer to the RDL team on the questions of terminology :-)

Some feedback on the proposed data modelling:

event_set.occurrence_range

I have a few concerns about this field:

  1. It's best to avoid adding fields for facts that can be calculated from other fields because it introduces the possibility of inconsistencies. If this field is simply summarising data that is already available in event.occurrence.return_period/event.occurrence.probabilistic.return_period, then it should be omitted. Consuming applications can summarise data in the way that users need, without the summary being available in the original data. Having multiple representations causes problems for data publishers (they need to make sure that their data is consistent) and data users (they might need to look in multiple places to find a fact and they need to decide how to handle inconsistent data). The exception to this would be if it's possible to know the occurrence range of a set of events whilst the individual return periods are unknown, in which case we'd need to keep the field.
  2. The example given ('20, 100, 250') is not a range in statistical terms (the difference between the largest and smallest values in a dataset), nor in common parlance (the smallest and largest values in a dataset). Rather, it's the set of return periods for the events in the event set. If we decide to keep this field, I think it should be renamed.
  3. If the value of the field is intended to be a list of values, as in the example, then its type should be array with an appropriate type of its items, i.e. number if the values will always be numbers etc.
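
To illustrate the first point, a consuming application could derive the summary from the individual events instead of reading a separate field. The sketch below assumes a hypothetical layout in which each event is a dict holding a numeric occurrence.return_period; neither the layout nor the function name is part of the schema.

```python
def summarise_return_periods(events: list) -> list:
    """Sorted, de-duplicated return periods found in a list of event dicts."""
    found = set()
    for event in events:
        return_period = event.get("occurrence", {}).get("return_period")
        if return_period is not None:
            found.add(return_period)
    return sorted(found)


events = [
    {"occurrence": {"return_period": 100}},
    {"occurrence": {"return_period": 20}},
    {"occurrence": {"return_period": 250}},
]
print(summarise_return_periods(events))  # [20, 100, 250]
```
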

Event sets and events

From the discussion in this issue it sounds like there is a 1:n relationship between an event set and its events, but I'm not seeing that in the schema, where both hazard.event_set and hazard.event are (singular) objects. Should the modelling instead look like this?

{
  "hazard": {
    "event_set": {
      "events": [
        {
          // event A
        },
        {
          // event B
        }
      ]
    }
  }
}

Return period

Are return periods always expressed with 1 as the numerator, e.g. 1/10, 1/20, 1/50, or is it possible to have a different numerator, e.g. 2/10, 5/50?

If the numerator is always 1, then I would model the return period as a number representing the denominator, which will be easier to validate than a string.
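
If existing data already uses the "1/n" string form, converting it to a number could look like the sketch below. The function name is hypothetical, and rejecting numerators other than 1 is an assumption pending the answer to the question above.

```python
def parse_return_period(text: str) -> int:
    """Convert a '1/n' string into the integer n; reject other numerators."""
    numerator, separator, denominator = text.strip().partition("/")
    if separator != "/" or numerator.strip() != "1":
        raise ValueError(f"expected '1/n', got {text!r}")
    years = int(denominator)  # raises ValueError if n is not an integer
    if years <= 0:
        raise ValueError("return period must be a positive number of years")
    return years


print(parse_return_period("1/250"))  # 250
```
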

Percentages

A percentage is a number expressed as a fraction of 100, e.g. 45% is 45/100, so the proper way to model a percentage is as a number between 0 and 1, e.g. 0.45. That's consistent with how percentages are treated in spreadsheets: if you type 0.45 in a cell and apply percentage formatting, the result is 45%. Whereas, if you type 45 in a cell and apply percentage formatting, the result is 4500%. As long as the field's description states that it is expressed as a percentage, consuming applications can present it however makes the most sense to their users. The JSON Schema keywords minimum and maximum should be used to limit the range of permitted values.
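
The same behaviour can be seen with Python's percentage formatting. This sketch uses a hypothetical helper name and mirrors the minimum/maximum limits with a simple range check:

```python
def format_probability(value: float) -> str:
    """Render a probability stored as a number between 0 and 1 as a percentage."""
    if not 0 <= value <= 1:
        raise ValueError("probability must be between 0 and 1")
    return f"{value:.0%}"


print(format_probability(0.45))  # 45%
print(f"{45.0:.0%}")             # 4500%, the pitfall described above
```
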

@matamadio
Contributor Author

matamadio commented May 25, 2023

Thanks all for the feedback.
The framing proposed by Jen makes sense to me, with the changes suggested by Duncan:

  • renaming event_set.occurrence_range as event_set.occurrence_scenarios or another term
  • changing the type to array
  • 1:n relationship between event_set and event

Deterministic option

An example for "deterministic" layer could be:

  • Global Landslide Susceptibility, where the value is an index representative of the likelihood of occurrence due to physical/morphological factors.
  • Similar would apply to geomorphologic pluvial flood mapping, i.e. where water accumulates due to terrain morphology, without any frequency attached or reference to real events

Multiple occurrence metrics, return period

The reason we give multiple ways to express the same concept of occurrence is to cover perspectives from different hazard practices:

  • For most common global hazard models, we usually work with the 1/n (years) return period formulation. Entering just "n" (years) is then the easier option to explain the data.
  • More detailed modelling offers a better representation of the frequency distribution, as Joh mentioned, which could also include seasonality.

An idea that I threw into the example was to translate any of the different frequency formats entered by the user into a common annual rate (event.occurrence), expressed as the percentage of occurrence in a given year. However, as Duncan said,

> Consuming applications can summarise data in the way that users need, without the summary being available in the original data.

Event sets and events

Yes, the 1:n relationship should be maintained:

{
  "hazard": {
    "event_set": {
      "events": [
        {
          // event A
        },
        {
          // event B
        }
      ]
    }
  }
}

@odscrachel odscrachel added the hazard Issues related to Hazard data label May 30, 2023
@stufraser1
Member

> Also, stochastic event catalogues in cat models are too large to be listed in meta-data.

We won't be listing the whole event set; this would be in a data file. The metadata in the event object aims to describe the RP etc of an individual hazard map or scenario event. To describe the event set we'd use the event_set object, but do need to make sure we can be clear whether the events within that are in a data file or have their metadata nested under the event_set

@stufraser1
Member

> We would call the reciprocal of the return period not an occurrence probability but an 'event rate' or 'annual arrival rate'. This is the average number of occurrences of the event within a given year, not a probability of occurrence.

This may well be what was then named exceedance rate in the earlier versions. Agree it should be included. The exceedance probability description should make clear that it reports the probability of one occurrence in a given year.

@stufraser1
Member

To address:

> We would call the reciprocal of the return period not an occurrence probability but an 'event rate' or 'annual arrival rate'. This is the average number of occurrences of the event within a given year, not a probability of occurrence.
>
> In order to work out the probability of 0, 1, 2, ... occurrences within a given year, we would refer to a frequency distribution, such as Poisson. From the assumed distribution and parameter (the event rate can be the lambda used to parameterise the Poisson, for example), we could calculate the probability of no occurrences, 1 occurrence, 2 occurrences of that event within a year, etc.

Proposed changes:

| Name | Description | Type |
| --- | --- | --- |
| event_set.occurrence_range | A summary of the probabilistic occurrence scenarios included in the event_set. This field must only be used if event_set.analysis_type = "probabilistic" | string |
| event.occurrence | The occurrence of the event. | object |
| event.occurrence.probabilistic | The frequency for the hazard scenario (event). Can be Return Period and/or probability. Both fields are provided to allow entry of return period (most common term used across all hazards) and/or probability, which is commonly used for seismic hazard. Probability commonly refers to a probability within 1 year or 50 years, but may be relative to any duration - when probability is used, span must be specified. | object |
| event.occurrence.probabilistic.return_period | The event return period (or recurrence interval); the estimated average time between events. Expressed as "1/10" for a 1-in-10 year return period, etc. | string |
| event.occurrence.probabilistic.event_rate | The average number of occurrences of the event within a given year. This is the reciprocal of the return period, related by event_set.frequency_distribution | float |
| event.occurrence.probabilistic.probability | The occurrence probability of the modelled event. | object |
| event.occurrence.probabilistic.probability.value | The value of the occurrence probability expressed as a float - for example 0.1 = 10% probability. Requires event.occurrence.probabilistic.probability.span to define the period in which the probability applies. | float |
| event.occurrence.probabilistic.probability.span | The duration (in years) used to specify the probability. | integer |
| event.occurrence.empirical | The time period over which the observed event took place. | object |
| event.occurrence.empirical.time_start | The event start date and time. | ISO 8601 |
| event.occurrence.empirical.time_end | The event end date and time. This may be identical to time_start, or may be a date and time occurring after time_start | ISO 8601 |
| event.occurrence.deterministic | The thresholds classification or index values used by the deterministic event set, index approaches and susceptibility data. | string |

@odscjen
Contributor

odscjen commented Jun 19, 2023

Thanks for summarising this @stufraser1, just one suggested change to address

> Are return periods always expressed with 1 as the numerator, e.g. 1/10, 1/20, 1/50, or is it possible to have a different numerator, e.g. 2/10, 5/50? If the numerator is always 1, then I would model the return period as a number representing the denominator, which will be easier to validate than a string.

and the answer

> For most common global hazard models, we usually work with 1/n (years) return period formulation. Then entering just "n" (years) is the easier option to explain the data.

And just an addition to each of the 3 object type descriptions to make it clearer in which situation each is expected to be used. (and I added titles more for our benefit when we get to actually making this change to the schema)

| Name | Title | Description | Type |
| --- | --- | --- | --- |
| event_set.occurrence_range | Occurrence range | A summary of the probabilistic occurrence scenarios included in the event_set. This field must only be used if event_set.analysis_type = "probabilistic" | string |
| event.occurrence | Occurrence | The occurrence of the event. | object |
| event.occurrence.probabilistic | Probabilistic frequency | The frequency for the hazard scenario (event). Can be Return Period and/or probability. Both fields are provided to allow entry of return period (most common term used across all hazards) and/or probability, which is commonly used for seismic hazard. Probability commonly refers to a probability within 1 year or 50 years, but may be relative to any duration - when probability is used, span must be specified. This object must only be used if event_set.analysis_type = "probabilistic" | object |
| event.occurrence.probabilistic.return_period | Return period | The event return period (or recurrence interval); the estimated average time between events. Expressed as the denominator of "1/n", e.g. "10" for a 1-in-10 year return period etc. | integer |
| event.occurrence.probabilistic.event_rate | Event rate | The average number of occurrences of the event within a given year. This is the reciprocal of the return period, related by event_set.frequency_distribution | float |
| event.occurrence.probabilistic.probability | Probability | The occurrence probability of the modelled event. | object |
| event.occurrence.probabilistic.probability.value | Probability value | The value of the occurrence probability expressed as a float - for example 0.1 = 10% probability. Requires event.occurrence.probabilistic.probability.span to define the period in which the probability applies. | float |
| event.occurrence.probabilistic.probability.span | Probability span | The duration (in years) used to specify the probability. | integer |
| event.occurrence.empirical | Empirical | The time period over which the observed event took place. This field must only be used if event_set.analysis_type = "empirical" | object |
| event.occurrence.empirical.time_start | Event start time | The event start date and time. | ISO 8601 |
| event.occurrence.empirical.time_end | Event end time | The event end date and time. This may be identical to time_start, or may be a date and time occurring after time_start | ISO 8601 |
| event.occurrence.deterministic | Deterministic frequency | The thresholds classification or index values used by the deterministic event set, index approaches and susceptibility data. This field must only be used if event_set.analysis_type = "deterministic" | string |

@stufraser1
Member

Suggest changing the title 'Deterministic frequency', as this isn't a frequency, to 'Deterministic approach' or 'Deterministic measure'?
Should this be an object? It's the only case that isn't an object here.

Additionally, a deterministic event can be assigned a return period. In that case we would use event_set.analysis_type = "deterministic" and event.occurrence.probabilistic.return_period (event.occurrence.deterministic.return_period?)

So we would have
event.occurrence.deterministic (object)
and under that:
event.occurrence.deterministic.return_period (for the RP of a deterministic scenario event)
event.occurrence.deterministic.index_values
event.occurrence.deterministic.susceptibility

@matamadio to check

@odscjen
Contributor

odscjen commented Jun 20, 2023

> Should this be an object? It's the only case that isn't an object here.

No it's fine to leave this as a field, unless you do want to put in the other potential fields mentioned. Happy to leave this until @matamadio has had a chance to review

@matamadio
Contributor Author

matamadio commented Jul 4, 2023

> Suggest changing the title 'Deterministic frequency', as this isn't a frequency, to 'Deterministic approach' or 'Deterministic measure'? Should this be an object? It's the only case that isn't an object here.
>
> Additionally, a deterministic event can be assigned a return period. In that case we would use event_set.analysis_type = "deterministic" and event.occurrence.probabilistic.return_period (event.occurrence.deterministic.return_period?)

Politely disagreeing on both points: a deterministic layer describes the likelihood of an event (e.g. landslide susceptibility: likely to occur, unlikely to occur...), and in that sense it has a frequency attribute. However, the frequency is assessed qualitatively (index, ranking, etc.) and is not tied to a specific occurrence probability, rather using the mean or median. If the layer describes an index value linked to a specific return period, then event.occurrence.probabilistic.return_period should be used instead of event.occurrence.deterministic.

Also not clear to me what would be the difference between these two attributes:

event.occurrence.deterministic.index_values
event.occurrence.deterministic.susceptibility

@stufraser1
Member

stufraser1 commented Jul 5, 2023

@matamadio your points are well taken.
The use case I'm thinking of is a deterministic scenario where we have a footprint of an event assessed as a 1:100-year flood. We need somewhere to note that, and I had understood event.occurrence.probabilistic.return_period should only be used for probabilistic analyses, which doesn't fit the case of this deterministic scenario analysis (so I suggested an alternative field that could be used under the deterministic object). @matamadio where/how do you see a deterministic scenario footprint/map being described?

We need to be very clear in explaining the use of 'frequency' for these attributes because I think many data users/producers will more readily associate frequency with quantitative measure / probabilistic analysis rather than the qualitative way described above.

@stufraser1
Member

> Also not clear to me what would be the difference between these two attributes:
> event.occurrence.deterministic.index_values
> event.occurrence.deterministic.susceptibility

I did not explain this well; I think we actually only need event.occurrence.deterministic.index_values

@odscrachel
Contributor

Can I check where we are with this and fill in the blanks below? Does this represent what has been agreed so far?

| Name | Title | Description | Type |
| --- | --- | --- | --- |
| event_set.occurrence_range | Occurrence range | A summary of the probabilistic occurrence scenarios included in the event_set. This field must only be used if event_set.analysis_type = "probabilistic" | string |
| event.occurrence | Occurrence | The occurrence of the event. | object |
| event.occurrence.probabilistic | Probabilistic frequency | The frequency for the hazard scenario (event). Can be Return Period and/or probability. Both fields are provided to allow entry of return period (most common term used across all hazards) and/or probability, which is commonly used for seismic hazard. Probability commonly refers to a probability within 1 year or 50 years, but may be relative to any duration - when probability is used, span must be specified. This object must only be used if event_set.analysis_type = "probabilistic" | object |
| event.occurrence.probabilistic.return_period | Return period | The event return period (or recurrence interval); the estimated average time between events. Expressed as the denominator of "1/n", e.g. "10" for a 1-in-10 year return period etc. | integer |
| event.occurrence.probabilistic.event_rate | Event rate | The average number of occurrences of the event within a given year. This is the reciprocal of the return period, related by event_set.frequency_distribution | float |
| event.occurrence.probabilistic.probability | Probability | The occurrence probability of the modelled event. | object |
| event.occurrence.probabilistic.probability.value | Probability value | The value of the occurrence probability expressed as a float - for example 0.1 = 10% probability. Requires event.occurrence.probabilistic.probability.span to define the period in which the probability applies. | float |
| event.occurrence.probabilistic.probability.span | Probability span | The duration (in years) used to specify the probability. | integer |
| event.occurrence.empirical | Empirical | The time period over which the observed event took place. This field must only be used if event_set.analysis_type = "empirical" | object |
| event.occurrence.empirical.time_start | Event start time | The event start date and time. | ISO 8601 |
| event.occurrence.empirical.time_end | Event end time | The event end date and time. This may be identical to time_start, or may be a date and time occurring after time_start | ISO 8601 |
| event.occurrence.deterministic | Deterministic frequency | The thresholds classification or index values used by the deterministic event set, index approaches and susceptibility data. This object must only be used if event_set.analysis_type = "deterministic" | object |
| event.occurrence.deterministic.return_period | Return period | | |
| event.occurrence.deterministic.index_values | Index values | | |

@matamadio
Contributor Author

matamadio commented Jul 5, 2023

> @matamadio your points are well taken. The use case I'm thinking of is a deterministic scenario where we have a footprint of an event assessed as a 1:100-year flood. We need somewhere to note that, and I had understood event.occurrence.probabilistic.return_period should only be used for probabilistic analyses, which doesn't fit the case of this deterministic scenario analysis (so I suggested an alternative field that could be used under the deterministic object). @matamadio where/how do you see a deterministic scenario footprint/map being described?

IMHO, an event framed as "1:100", even when used as a single "mean representative" scenario, is still a slice of a probabilistic dataset; hence event.occurrence.probabilistic.return_period should be used. The fact that the rest of the distribution is not calculated or provided doesn't change its nature.

The deterministic layers I'm thinking about are usually produced combining variables in an index, and do not relate to any RP, e.g.:

Landslide susceptibility: DEM (slope) + geology (soil type) + land cover (soil cover type) = index value
River flood susceptibility: DEM (morphology) + hydro network (distance from river) = index value
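
As an illustration of such an index-based layer, the sketch below combines normalised input scores into a single index value using a weighted mean. The input names, the 0-1 normalisation and the equal weights are all assumptions made for the example; real susceptibility models use their own criteria.

```python
def susceptibility_index(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted mean of normalised (0-1) input scores.

    Assumption: every input has already been rescaled to the 0-1 range.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in weights) / total_weight


# Hypothetical landslide susceptibility inputs for one grid cell.
cell_scores = {"slope": 0.8, "soil_type": 0.5, "land_cover": 0.2}
equal_weights = {"slope": 1.0, "soil_type": 1.0, "land_cover": 1.0}

index_value = susceptibility_index(cell_scores, equal_weights)
```

The resulting value ranks cells relative to each other; it carries no return period or occurrence probability.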

@duncandewhurst duncandewhurst mentioned this issue Jul 5, 2023
2 tasks
@odscjen
Contributor

odscjen commented Jul 7, 2023

@stufraser1 @matamadio is there a consensus yet for what to do with 'deterministic'?

@stufraser1
Member

@matamadio to propose type and short description for event.occurrence.deterministic.index_values consistent in style with descriptions elsewhere in above table please.

@stufraser1
Member

stufraser1 commented Jul 8, 2023

IMHO, an event framed as "1:100", even when used as a single "mean representative" scenario, is still a slice of a probabilistic dataset; hence event.occurrence.probabilistic.return_period should be used.

If we want to assign a return period to an empirical event would you propose using event.occurrence.probabilistic.return_period too?

I guess we could use it as long as the guidance is clear that it can be used for these purposes as well as for defining e.g. the return period of a hazard map. It makes sense, but the nature of the object changes, and we should remove the restriction "This object must only be used if event_set.analysis_type = "probabilistic"".

Proposed description for event.probabilistic is then:
The frequency for the hazard scenario (event). Can be Return Period and/or probability. Both fields are provided to allow entry of return period (most common term used across all hazards) and/or probability, which is commonly used for seismic hazard. Probability commonly refers to a probability within 1 year or 50 years, but may be relative to any duration - when probability is used, span must be specified. This object can be used to describe the probability of return period hazard maps, or the defined probability of single event footprints (historical or hypothetical scenarios). It can be used with any event_set.analysis_type.
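
The rule "when probability is used, span must be specified" could be checked along these lines. This is only a sketch: the nested dictionary layout mirrors the dotted field names in the table, and the function name and error messages are hypothetical.

```python
def check_probabilistic(occurrence: dict) -> list[str]:
    """Return a list of problems found in a hypothetical occurrence object."""
    errors = []
    probability = occurrence.get("probabilistic", {}).get("probability")

    if probability is not None:
        value = probability.get("value")
        # Probability is expressed as a float, e.g. 0.1 = 10%.
        if value is None or not 0 <= value <= 1:
            errors.append("probability.value must be a float between 0 and 1")
        # The span (in years) gives the probability its meaning.
        if probability.get("span") is None:
            errors.append("probability.span is required when probability is used")

    return errors


# Return period only: no span needed.
rp_only = {"probabilistic": {"return_period": 100}}

# Probability without span: flagged.
missing_span = {"probabilistic": {"probability": {"value": 0.1}}}

# 10% probability in 50 years, as commonly used for seismic hazard.
seismic = {"probabilistic": {"probability": {"value": 0.1, "span": 50}}}
```

Here check_probabilistic(rp_only) and check_probabilistic(seismic) return an empty list, while check_probabilistic(missing_span) reports the missing span.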

@matamadio
Contributor Author

matamadio commented Jul 10, 2023

If we want to assign a return period to an empirical event would you propose using event.occurrence.probabilistic.return_period too?

I would rather propose an additional 'event.occurrence.empirical.return_period' here, with the description:

"Probabilistic frequency estimate associated with the empirical events in terms of hazard intensity".

@matamadio to propose type and short description for event.occurrence.deterministic.index_values consistent in style with descriptions elsewhere in above table please.

Proposing two new fields for 'event.occurrence.deterministic':

  • event.occurrence.deterministic.index_criteria: explains the criteria according to which the deterministic index values are calculated, e.g. mean, max, min, others... (description).
  • event.occurrence.deterministic.thresholds: specify the thresholds (if any) used to classify the index value. Array (open)

Summary table (bottom part, top unchanged)

| Name | Title | Description | Type |
|---|---|---|---|
| event.occurrence.empirical | Empirical | The time period over which the observed event took place and the associated return period (if any). This field must only be used if event_set.analysis_type = "empirical" | object |
| event.occurrence.empirical.time_start | Event start time | The event start date and time. | ISO 8601 |
| event.occurrence.empirical.time_end | Event end time | The event end date and time. This may be identical to time_start, or may be a date and time occurring after time_start | ISO 8601 |
| event.occurrence.empirical.return_period | Associated return period | Probabilistic frequency estimate associated with the empirical events in terms of hazard intensity | integer |
| event.occurrence.deterministic | Deterministic frequency | The index criteria and thresholds classification used by the deterministic event set, index approaches and susceptibility data. This object must only be used if event_set.analysis_type = "deterministic". | object |
| event.occurrence.deterministic.index_criteria | Index criteria | Full description of the approach and criteria used to produce the index value. A deterministic hazard intensity index (ranking, score, etc.) is not tied to a specific occurrence probability; rather, it is produced using an aggregation criterion (e.g. max, mean, median of annual values over a period; multi-criteria combination; Principal Component Analysis; other). | string |
| event.occurrence.deterministic.thresholds | Index thresholds | The thresholds used to classify the index value | array |
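
For illustration, a thresholds array could be used to classify an index value as sketched below. The threshold values, the class labels and the convention that a value equal to a threshold falls into the higher class are assumptions for the example, not part of the proposal.

```python
from bisect import bisect_right


def classify_index(value: float, thresholds: list[float], labels: list[str]) -> str:
    """Map an index value to a class label using ascending thresholds.

    Assumption: n thresholds define n + 1 classes, and a value equal to a
    threshold belongs to the class above it.
    """
    if len(labels) != len(thresholds) + 1:
        raise ValueError("expected one more label than thresholds")
    return labels[bisect_right(thresholds, value)]


# Hypothetical thresholds for a susceptibility index in the 0-1 range.
thresholds = [0.25, 0.5, 0.75]
labels = ["low", "moderate", "high", "very high"]

class_of_06 = classify_index(0.6, thresholds, labels)
class_of_025 = classify_index(0.25, thresholds, labels)
```

With these assumed values, 0.6 is classified as "high" and 0.25 as "moderate".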
