How to choose which "variable combinations" to use in a study? #101

Closed
DenoBeno opened this issue Sep 17, 2019 · 26 comments
Labels: BB: Scenario Management (Scenario Management Building Block), enhancement (New feature or request)

@DenoBeno

DenoBeno commented Sep 17, 2019

Related to: clarity-h2020/data-package#42, clarity-h2020/data-package#47, clarity-h2020/data-package#46, clarity-h2020/data-package#45
Possibly also related to https://github.com/clarity-h2020/emikat/issues/18

As discussed earlier today, we are overwhelming the users by letting them see all possible combinations of time_period (baseline + 3), emission_scenario (baseline + 3) and event frequency (3), which gives 27 or 48 combinations, depending on how we count. From the user's point of view, both 27 and 48 qualify as "too many to deal with".

Consequently, we need a way to limit the number of combinations shown. The question is "how"?
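
For reference, both counts can be reproduced by enumerating the Cartesian product of the three parameters. This is only a sketch: the period names are placeholders, and reading 27 as "without baselines" and 48 as "with baselines" is an assumption about how the counting was meant.

```javascript
// Cartesian product of any number of option lists.
function product(...lists) {
  return lists.reduce(
    (acc, list) => acc.flatMap((combo) => list.map((item) => [...combo, item])),
    [[]]
  );
}

// Placeholder period names (assumption); only the number of options matters here.
const futurePeriods = ["period-1", "period-2", "period-3"];
const futureScenarios = ["rcp26", "rcp45", "rcp85"];
const frequencies = ["rare", "occasional", "frequent"];

// Without the baselines: 3 x 3 x 3 combinations.
const withoutBaselines = product(futurePeriods, futureScenarios, frequencies);

// With a baseline added to time period and emission scenario: 4 x 4 x 3 combinations.
const withBaselines = product(
  ["baseline", ...futurePeriods],
  ["baseline", ...futureScenarios],
  frequencies
);

console.log(withoutBaselines.length, withBaselines.length); // 27 48
```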

@DenoBeno
Author

My first idea is the following: Let the user choose a combination of the parameters at some point in the study definition.

This could be done e.g. in the following way:

  1. Let the user name some combinations of parameters. Effectively, something like the table below:
    image

  2. Use these named combinations in maps, tables etc., rather than showing all possible combinations. E.g. in the map or in the table, the users would only be able to choose "My baseline", "My frequent event" and "My rare event", which are just labels the users have chosen themselves.

The advantage of this approach is that it's up to the user to decide whether to keep the time period, emission scenario or event frequency fixed, and which one changes between their "variants".

This sounds like a reasonable approach to me, except that I'm not sure:

  1. Where in the study the user should be able to define these named parameter combinations. This could be e.g. right in the beginning, when a study is defined or later, e.g. in the risk/impact tab. (Or both?). If it's in the risk/impact tab, where exactly? Under "data"? Or in the "table" tab?
  2. What should we call these named parameter combinations? Are these the "study events"? Something else?
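
To make the idea more concrete, a named combination could be as small as a label plus one value per parameter. The sketch below is hypothetical: the field names and the values are illustrative only and do not reflect an existing CSIS data model.

```javascript
// Hypothetical shape of user-defined named combinations (not an existing CSIS structure).
const namedCombinations = [
  { label: "My baseline", timePeriod: "baseline", emissionScenario: "baseline", eventFrequency: "frequent" },
  { label: "My frequent event", timePeriod: "2041-2070", emissionScenario: "rcp45", eventFrequency: "frequent" },
  { label: "My rare event", timePeriod: "2041-2070", emissionScenario: "rcp45", eventFrequency: "rare" },
];

// What a map or table would offer for selection: only the user's own labels.
function selectableLabels(combinations) {
  return combinations.map((combination) => combination.label);
}

// Resolve a chosen label back to the three parameter values.
function resolveCombination(combinations, label) {
  const match = combinations.find((combination) => combination.label === label);
  if (!match) {
    throw new Error(`Unknown combination: ${label}`);
  }
  const { label: _ignored, ...parameters } = match;
  return parameters;
}
```

In this example only the event frequency differs between "My frequent event" and "My rare event", but nothing stops the user from varying the time period or the scenario instead.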

@DenoBeno added the "help wanted" and "question" labels Sep 17, 2019
@p-a-s-c-a-l added the "on-hold" label Sep 17, 2019
@p-a-s-c-a-l added this to Backlog: High Priority in T1.3 Climate Services Co-creation via automation Sep 17, 2019
@p-a-s-c-a-l added this to the D1.4 CLARITY CSIS v2 milestone Sep 17, 2019
@RobAndGo

RobAndGo commented Sep 17, 2019

This is just a guess, but the user may not have sufficient knowledge to decide which emissions (RCP) scenario to use. It might be simpler to use a default scenario of RCP45 together with the baseline as a reference, meaning one less thing for the user to decide.

Then, once the output from this scenario is presented, the user could be offered the option to select one of the other emissions scenarios as a comparison.

The time period could also be selected by the user from the start, with the option to select another time period for comparison once the data is shown.
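
A minimal sketch of this "defaults first, comparison later" flow, assuming a plain object for the selection. All names here are hypothetical and only illustrate the suggestion; they are not part of CSIS.

```javascript
// Assumed defaults, following the suggestion above: RCP45 with the baseline as reference.
const DEFAULTS = { emissionScenario: "rcp45", reference: "baseline" };

// The user picks only the time period up front; everything else is preset.
function initialSelection(timePeriod) {
  return { ...DEFAULTS, timePeriod, comparisons: [] };
}

// Once results are shown, another scenario or time period can be added for comparison.
function addComparison(selection, overrides) {
  const comparison = {
    emissionScenario: selection.emissionScenario,
    timePeriod: selection.timePeriod,
    ...overrides,
  };
  return { ...selection, comparisons: [...selection.comparisons, comparison] };
}

const start = initialSelection("2041-2070");
const compared = addComparison(start, { emissionScenario: "rcp85" });
```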

@DenoBeno
Author

@patrickkaleta , @therter : if we decide to go this way, such named combinations would need to be stored somewhere. Possibilities:

  • in the group itself or in the ("risk/impact") GL-step node?
  • as paragraph or as yet another node type?
  • other?

What would be easiest for the "applications" to use, e.g. for the CISMET map application?

FYI: I'm inclined to say "a paragraph in the group", because resources that use the various variables could appear in any step and because there is IMO no need to reuse this across studies. For the same reason, I'm inclined to say "let the user define this right at the start", but I'm not sure if that's really appropriate from the workflow point of view.

@ALL: WDYT?

@DenoBeno
Author

DenoBeno commented Sep 17, 2019

> This is just a guess, but the user may not have sufficient knowledge to decide which emissions (RCP) scenario to use. It might be simpler to use a default scenario of RCP45 together with the baseline as a reference, meaning one less thing for the user to decide.
>
> Then, once the output from this scenario is presented, the user could be offered the option to select one of the other emissions scenarios as a comparison.
>
> The time period could also be selected by the user from the start, with the option to select another time period for comparison once the data is shown.

Yes, I was also thinking about letting the user choose parameters at the start and then change them in the risk/impact section if they want.
Or even having some presets when the user starts the study, which they can change if they want. This sounds like a good idea, except that we only have templates for the "steps" and not for the study group. I'm also not sure whether the "default value" for a field can have more than one value, or what happens if I give a default value to a paragraph field...

By the way, I already thought of providing additional information on RCPs; this is already in the system. We just need to make use of it.

image

@DenoBeno
Author

@p-a-s-c-a-l : I don't think that it's a good idea to postpone this issue, because this would mean that @therter first has to implement a map application that shows all the combinations (that's 27 or 48 layers for each resource in the DP that uses these variables, oops!) and change it later.

It's IMO less work if we resolve this issue first. Once that's done, we will be able to count on the number of layers that need to be shown in a map being "reasonable": 10-20 rather than 100+.

@p-a-s-c-a-l
Member

No, we don't show all possible layer combinations. I'm implementing the map component and the new table component. As a starting point, choices are preselected (hardcoded) as suggested by Robert. Therefore, there is no need to make a selection ATM.

@DenoBeno
Author

Ah, I understand. So we will pre-set some combinations and then the user choice can be added at a later time. Yes, makes sense and this will not cause work duplication as I feared. Good, let's put this "on hold" for now.

Before I go away, here is a second design variant I thought of, for reference:

image

Reasoning:

  • "label" could be automatically generated from the three parameters, so we could do without it.
  • if our users don't know what rcp45 stands for, we could use a short description. Or call them anything else that makes sense to users. E.g. "too optimistic", "somewhat optimistic" and "Trump stays a president". .-)

@RobAndGo

As a tag for the resources, the "rcp45" scenario is classified as "effective-measures" with respect to greenhouse gas emissions. For completeness, rcp26 = early-response, and rcp85 = business-as-usual.
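
If these tags were used as user-facing descriptions, a simple lookup would be enough. The sketch below uses only the three scenario/tag pairs named above; the function name and the output format are assumptions.

```javascript
// Scenario identifiers and their tags, as listed above.
const RCP_TAGS = new Map([
  ["rcp26", "early-response"],
  ["rcp45", "effective-measures"],
  ["rcp85", "business-as-usual"],
]);

// Hypothetical helper: show the tag next to the identifier, or just the
// identifier when no tag is known.
function describeScenario(id) {
  const tag = RCP_TAGS.get(id.toLowerCase());
  return tag ? `${id} (${tag})` : id;
}

console.log(describeScenario("rcp45")); // rcp45 (effective-measures)
```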

@patrickkaleta

> @patrickkaleta , @therter : if we decide to go this way, such named combinations would need to be stored somewhere. Possibilities:
>
>   • in the group itself or in the ("risk/impact") GL-step node?
>   • as paragraph or as yet another node type?
>   • other?
>
> What would be easiest for the "applications" to use, e.g. for the CISMET map application?

I would also store it directly in the group and not in the RIA GL-step (What if the RIA step is not available in the selected study type? What if this information is needed in a step prior to the RIA step? ...).

Our components (map application, etc.) will most likely access this information via JSON:API, so I'm not sure if it would be easier to have it in a paragraph or a new node type. JSON:API and Group content have their issues, and JSON:API with Paragraphs has not yet been tested by us (AFAIK).

So, when we go back working on this, I would first test accessing paragraphs in a group via JSON:API and then make a decision on how to store this information.

@DenoBeno added the "workaround implemented" label and removed the "help wanted" label Sep 21, 2019
@DenoBeno
Author

I have added a new "workaround implemented" flag to indicate that this isn't just on hold because we can't fix it, but that we simply don't need to fix it at the moment.

@p-a-s-c-a-l
Member

Not sure which "workaround" you mean; AFAIK this isn't working yet.

The additional EMIKAT time_period, emission_scenario and event_frequency parameters are now supported in MapComponent and partially in Table Component.

In order to initialise the respective Map/Table iFrames with the proper query parameters, we have to provide a way to choose the parameters in CSIS and then include the choices in the CSIS_Helpers Module's JavaScript Object. So in principle I don't care where this information is actually stored (paragraph, etc.) as long as it is made available in JavaScript rather than via Drupal JSON:API.

@DenoBeno
Author

Decision: add a paragraph that allows the user to set the three variables and give the combination a name.

Later extensions:

  1. Allow users to choose more than one such combination so that they can be compared.
  2. Add a fourth variable for the study variant.

Ideally, this should be positioned right after choosing the study data package, so that we can provide immediate feedback on whether the chosen combination makes sense (also a future extension).

@DenoBeno
Author

DenoBeno commented Sep 23, 2019

> As a tag for the resources, the "rcp45" scenario is classified as "effective-measures" with respect to greenhouse gas emissions. For completeness, rcp26 = early-response, and rcp85 = business-as-usual.

@RobAndGo : what is the meaning of Rare/Occasional/Frequent?

@DenoBeno
Author

DenoBeno commented Sep 23, 2019

I have added a field_var_meaning to dp_variables taxonomy (Label: "CSIS label (variable meaning)") and configured the taxonomy terms as shown below:

image

The title is automatically built from the variable name, value and label/meaning on save. Thus, we can have this now:

image

The first one is to be used with EMIKAT, the second with heat hazard data.

@RobAndGo

RobAndGo commented Sep 23, 2019

> @RobAndGo : what is the meaning of Rare/Occasional/Frequent?

@DenoBeno: These refer to how often the event occurs. We defined:

  • Rare event = 1 event in 20 years (probability of occurrence = 0.05)
  • Occasional event = 1 event in 5 years (probability of occurrence = 0.2)
  • Frequent event = 1 event per year (probability of occurrence = 1.0)
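
The probabilities above are simply the reciprocal of the return period in years. A small sketch, with hypothetical names, that derives them from the three definitions:

```javascript
// Return periods in years, taken from the definitions above.
const RETURN_PERIOD_YEARS = new Map([
  ["rare", 20],
  ["occasional", 5],
  ["frequent", 1],
]);

// Annual probability of occurrence = 1 / return period.
function annualProbability(frequency) {
  const years = RETURN_PERIOD_YEARS.get(frequency);
  if (years === undefined) {
    throw new Error(`Unknown event frequency: ${frequency}`);
  }
  return 1 / years;
}

console.log(annualProbability("rare")); // 0.05
```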

@DenoBeno
Author

Oh gee... The field_var_meaning in dp_variables will not work as I hoped.

  1. We will probably need both a human-readable and a machine-readable variant (ID) of the name.
  2. Even worse, we have to ensure that these labels are the same in the dp_variables taxonomy and in the place where users choose what they want shown in the study.

In short, we need a helper taxonomy called "dp variable meanings".
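
One way to picture such a helper taxonomy: every meaning term carries a machine-readable ID and a human-readable label, and both the resource tags and the study settings store only the ID. The sketch below is hypothetical. The term shape is an assumption, the "scenario:rcp45" and "frequency:rare" IDs follow the form that shows up later in this thread, and the other two IDs are extrapolated from that pattern.

```javascript
// Hypothetical "dp variable meanings" terms: machine-readable id plus human-readable label.
const variableMeanings = [
  { id: "scenario:rcp26", label: "early-response" },     // id extrapolated (assumption)
  { id: "scenario:rcp45", label: "effective-measures" },
  { id: "scenario:rcp85", label: "business-as-usual" },  // id extrapolated (assumption)
  { id: "frequency:rare", label: "Rare" },
];

const meaningsById = new Map(variableMeanings.map((term) => [term.id, term]));

// Labels are looked up in one place, so they cannot drift apart between
// the resource tags and the study settings.
function labelFor(id) {
  const term = meaningsById.get(id);
  return term ? term.label : id; // fall back to the raw id when no term exists
}
```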

@patrickkaleta added the "enhancement" label and removed the "on-hold" label Sep 23, 2019
@DenoBeno
Author

OK, here is the initial implementation:

  1. DP_variables taxonomy, to be used in defining the resource tags: https://csis.myclimateservice.eu/admin/structure/taxonomy/manage/dp_variables/overview

  2. Variables subqueue of the Tags entity queue: https://csis.myclimateservice.eu/admin/structure/entityqueue/dp_tags/variables?destination=/admin/structure/entityqueue/dp_tags/list

  3. Entity browser view for this queue: https://csis.myclimateservice.eu/admin/structure/views/view/entity_browser_for_taxonomies/edit/entity_browser_12

  4. Used in the tags entity browser (which is used in the dp_resource node type): https://csis.myclimateservice.eu/admin/config/content/entity_browser/data_packages_tags/edit

  5. DP_variables_meanings taxonomy, to be used in defining which variable combinations to use in a study and referenced from dp_variables: https://csis.myclimateservice.eu/admin/structure/taxonomy/manage/dp_variable_meanings/overview

  6. Queue with a couple of subqueues for this vocabulary: https://csis.myclimateservice.eu/admin/structure/entityqueue/study_variables

  7. Paragraph where one can define a combination of variables: https://csis.myclimateservice.eu/admin/structure/paragraphs_type/variable_set

  8. Used in a study group:
    image

image

image

@DenoBeno
Author

Implementation status:

  1. It's possible to define one "preset" for the study; adding more than one would be trivial.
  2. Everything is set up in a way that will allow us to relate these presets with allowed variable values in resources.

@patrickkaleta : your turn to make use of this now...

@patrickkaleta

patrickkaleta commented Sep 24, 2019

> Not sure which "workaround" you mean; AFAIK this isn't working yet.
>
> The additional EMIKAT time_period, emission_scenario and event_frequency parameters are now supported in MapComponent and partially in Table Component.
>
> In order to initialise the respective Map/Table iFrames with the proper query parameters, we have to provide a way to choose the parameters in CSIS and then include the choices in the CSIS_Helpers Module's JavaScript Object. So in principle I don't care where this information is actually stored (paragraph, etc.) as long as it is made available in JavaScript rather than via Drupal JSON:API.

I've updated the studyInfo object in the csis helper module to include the new fields inside the study_presets attribute. The studyInfo object now looks like this (for Study 30):

    "csisHelpers": {
       "studyInfo": {
          "id": "30",
          ...
          "eea_city_name": "Dublin",
          "study_presets": {
             "time_period": "period:2041-2070",
             "emission_scenario": "scenario:rcp45",
             "event_frequency": "frequency:rare"
          }
       },

For the values I'm using whatever is stored for each of these DP variable meanings terms inside the meaning ID field. @p-a-s-c-a-l can you work with that or do you need additional adjustments?

ATM these new values are only shown when visiting a GL-Step; on the Study overview page itself they are not yet available. That will be the next step for me.
Update: The new values are now also available on the Study page.
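
As a rough illustration of how a component could turn these presets into iFrame query parameters: the sketch below copies each entry of study_presets into the query string. The helper name and the base URL are hypothetical, and whether the Map/Table components expect the raw meaning IDs or a stripped value is not settled here.

```javascript
// Excerpt of the studyInfo object shown above (Study 30).
const studyInfo = {
  id: "30",
  eea_city_name: "Dublin",
  study_presets: {
    time_period: "period:2041-2070",
    emission_scenario: "scenario:rcp45",
    event_frequency: "frequency:rare",
  },
};

// Hypothetical helper: append every preset as a query parameter to the iFrame URL.
function buildComponentUrl(baseUrl, info) {
  const url = new URL(baseUrl);
  for (const [name, value] of Object.entries(info.study_presets || {})) {
    url.searchParams.set(name, value); // values are percent-encoded automatically
  }
  return url.toString();
}

// Placeholder base URL (assumption), not the real map application address.
const mapUrl = buildComponentUrl("https://example.org/map", studyInfo);
```

A study without presets simply yields the unchanged base URL, so a component could still fall back to its hardcoded defaults.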

@p-a-s-c-a-l
Member

Thanks, I've implemented it and it seems to work. At least the parameters are used in the table and map app, but there are still several problems with EMIKAT.

I think we can now merge csis-helpers-module/feature/010-extend-entityinfo into dev.

@patrickkaleta

> I think we can now merge csis-helpers-module/feature/010-extend-entityinfo into dev.

OK, I will do that. But first I will add the study_variant to the studyInfo object, since I saw that you requested it as well here.

@patrickkaleta

Done. study_variant is now included as well and feature has been merged into dev.

@DenoBeno
Author

DenoBeno commented Oct 1, 2019

This is basically done now. I'll open a new issue for allowing more than one preset and close this thread now.

@DenoBeno closed this as completed Oct 1, 2019
T1.3 Climate Services Co-creation automation moved this from Backlog: High Priority to Done Oct 1, 2019
@p-a-s-c-a-l added the "BB: Scenario Management" label Oct 30, 2019
@p-a-s-c-a-l added this to To do in T4.3 Scenario Management via automation Oct 30, 2019
@p-a-s-c-a-l moved this from To do to Done in T4.3 Scenario Management Nov 6, 2019
@p-a-s-c-a-l
Member

p-a-s-c-a-l commented Dec 12, 2019

I couldn't figure out how this "variable meaning" stuff is supposed to work. As far as I understood, it could (?) provide a solution to the following problem:

For the "Historical Time" period variable (${time_period}), two different values are used:

  • Hazard Layers: 19710101-20001231
  • EMIKAT: Baseline

Same problem for ${emissions_scenario}.

Now, one map contains both Hazard and EMIKAT layers. This means that URIs containing the ${time_period} variable have to be filled with historical_19710101-20001231 in one case and with Baseline in the other.

This is rather unfortunate, since it is a "self-made problem" that adds a lot of unnecessary complexity to the codebase. So instead of trying to work around a problem that we accidentally created ourselves, I propose to harmonise the variable values and get rid of the problem in the first place. This means EMIKAT should accept historical_19710101-20001231 instead of Baseline as the value for the historical time period. WDYT, @humerh?
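
To illustrate the difference, here is a minimal sketch of the placeholder substitution. The URI templates and the function name are made up for illustration; only the variable name and the two historical values come from this thread.

```javascript
// Replace ${name} placeholders in a URI template with values from a lookup.
// Placeholders without a value are left untouched.
function fillTemplate(template, values) {
  return template.replace(/\$\{(\w+)\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? encodeURIComponent(values[name])
      : placeholder
  );
}

// Hypothetical URI templates (assumption), one per layer source.
const hazardTemplate = "https://example.org/hazard/${time_period}/layer";
const emikatTemplate = "https://example.org/emikat?time_period=${time_period}";

// Current situation: the same variable needs a different value per source.
const historicalBySource = {
  hazard: { time_period: "historical_19710101-20001231" },
  emikat: { time_period: "Baseline" },
};
const hazardUri = fillTemplate(hazardTemplate, historicalBySource.hazard);
const emikatUri = fillTemplate(emikatTemplate, historicalBySource.emikat);

// After harmonisation: one set of values serves every source.
const harmonised = { time_period: "historical_19710101-20001231" };
const harmonisedUris = [hazardTemplate, emikatTemplate].map((template) =>
  fillTemplate(template, harmonised)
);
```

With harmonised values, the per-source lookup table disappears and the map component no longer has to know which source a layer belongs to.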

@p-a-s-c-a-l reopened this Dec 12, 2019
T1.3 Climate Services Co-creation automation moved this from Done to In Progress Dec 12, 2019
T4.3 Scenario Management automation moved this from Done to In progress Dec 12, 2019
@RobAndGo

@p-a-s-c-a-l, sorry but I had to take care of a lot of other stuff this week. What do you need from me? Should I start to upload the precipitation datasets into the data package?

@p-a-s-c-a-l
Member

No, ATM it's not working yet and I had to take care of the periodic report.
