multirun w/ "output_frequency" and "output_frequency_n" produces incomplete results #116

nkruskamp · 2021-05-31T14:34:56Z

I am trying to use pops_multirun with a 10 year simulation.

I only want outputs every 5 years, so I set output_frequency=year and output_frequency_n=5.

However, I still get outputs for all 10 years of the simulation. This is not the expected behavior. I would expect output rasters to have only two layers, not 10.

A second issue is the results are incomplete spatially. See below for example: the gray area at the top of the region should have results. This does not occur when these parameters are left default.

The text was updated successfully, but these errors were encountered:

wenzeslaus · 2021-06-01T14:02:48Z

output_frequency_n is meant to be used with output_frequency="every_n_steps". The documentation is probably incomplete, but it should cover your use case.

nkruskamp · 2021-06-01T14:36:42Z

I'm not sure I understand what you mean.

If I wanted an output every 5 years, what is the correct set of parameters to achieve that? This is what I currently tried:

output_frequency = "year",
output_frequency_n = 5,

The weather data is in weekly time steps, which is controlled by time_step = "week", so that is not related. Correct?

In either case, however, the results I'm getting in the example image are not usable and I'm not sure where that error is occurring.

wenzeslaus · 2021-06-01T14:57:25Z

I don't know about the spatial error. That looks serious, but what you need for output is:

output_frequency = "every_n_steps",
output_frequency_n = x,

where x is a number of weeks in a year times 5 assuming you have time_step = "week".

In other words, output_frequency is generally dependent on time_step. Then, monthly, yearly, etc. are special cases. Finally, year is a synonym for yearly, month for monthly. So it works differently than for time step which is creating confusion.

The reason is that this was a reasonable way how to support the different use cases without making things too complicated. Maybe something to revisit, definitively something to document better.

ChrisJones687 · 2021-06-01T15:20:04Z

Also not sure about the spatial error. I will take a look @nfkruska is this happening at all scales and can you point me to the code for running this.

I opened an issue in pops-core for the output_frequncy pops-core #142.

nkruskamp · 2021-06-02T01:33:03Z

@wenzeslaus I misunderstood and thought you were using "every_n_steps" as an alias for year month etc. I get it now! I will try to update my parameters and see what happens.

@ChrisJones687 I will need to do some more testing to 1) correct my use of the parameters and if that doesn't fix the issue, 2) see if scale is playing a role. The code is in the NCSU repo under 7c_pops_multirun.R

nkruskamp · 2021-06-02T16:25:18Z

Actually it looks like every_n_steps isn't an accepted argument here. See here:

rpops/R/checks.R

Line 281 in 41ffc17

if (checks_passed && !(output_frequency %in% list(

But I'm a bit confused then on the logic of how to specify the output_frequency_n if output_frequency != time_step. Is that parameter still given in the measure of time_step?

ChrisJones687 · 2021-06-02T17:18:45Z

I was planning on adding a modifier for output_frequency and output_frequency_n so that it would do conversions for the user from time steps to other units. There was never a use case for it previously, and I forgot to add a To Do in the code so that I knew to come back to this.

I might do this in the scheduling core and then we don't have to solve it in all interfaces.

nkruskamp · 2021-06-02T17:21:47Z

Where else might the parameter be used? I am trying to reduce the RAM usage on multirun since the extra outputs max out memory quickly.

wenzeslaus · 2021-06-02T19:32:07Z

Hm, the rpops and PoPS Core handling of this is really different. "Number of units" in time step is always 1 (set_step_num_units(1)) (aka m below). Thanks to that, time_step in rpops works as every_n_steps in Core.

Perhaps the naming in Core is just bad, but you can already do output daily, weekly, monthly, yearly, each step, and each nth step. What you can't do now in Core is to combine different units in simulation step and output step while outputting only every nth of the whatever unit you used, e.g., you can't say "output every two months" with simulation step "day". However, you can do every nth day, week, month, or year, although that's only if your simulation step is (one) day, week, month, or year, respectively, i.e., the units need to match. If you simulation step is more than one of the time unit, e.g., m days, than your output can still be something meaningful, but you are constrained to n being multiple of m, e.g., simulation step two day with output every four days.

I think it is possible to allow for arbitrary combinations in Core, but it didn't seem worth the trouble so far. What might be sufficient is to allow specifying output frequency in the same units as the simulation units if the n is a multiple of m. For example, you could say "simulation step is 2 days and output frequency/step is 4 days", but saying "simulation step is 2 days and output frequency/step is 5 days" would fail and "simulation step is 30 days and output frequency/step is 2 months" would fail too. However, this seem like a pain to check this ahead of time in R consistently with the Core, but perhaps there should be exception catching code in src/pops.cpp to handle these cases.

wenzeslaus · 2021-06-02T19:53:26Z

Where else might the parameter be used?

I'm not sure what you mean, but perhaps my other comment helps.

I am trying to reduce the RAM usage on multirun since the extra outputs max out memory quickly.

Well, it would defeat the whole point of convenience, but the r.pops.spread in GRASS does not store the outputs in memory, it outputs them to disk. Also the weather code there is ready for some potential memory consumption reduction.

nkruskamp · 2021-06-02T20:47:19Z

What you can't do now in Core is to combine different units in simulation step and output step while outputting only every nth of the whatever unit you used

So it sounds like I need to set time_step = "week", output_frequency = "time_step", output_frequency_n = (5 * 52) to get an output every 5 years? That way I'm using a consistent unit throughout?

I see the convenience factor by giving some control over output_frequency separately from time_step but it does seem to introduce an extra step and confusion from the users side. I think from that perspective it's easier for the user to understand everything in a single unit and convert accordingly.

Where else might the parameter be used?

My assumption is in pops_multirun it is used to control how often a raster output is saved. So when I see defaults set as output_frequency = "year" and output_frequency_n = 1, I take that to mean a 10 year simulation with have 10 output layers, 1 for every year. What I was trying to do was set output_frequency_n = 5 so that a raster layer is saved only every 5 years. So my question here means, can this parameter be used to control that same output for other methods like validate where I could have the function save 5 and 10 year validation data. However, right now the validate function saves the output at the end of the whole duration of the simulation, so can I control that to avoid running validate with different duration?

re: GRASS

I'm not sure I can switch my workflow to GRASS right now, I'll need to see how much effort that might take. But currently there is a lot that won't run for me (see my other issue #115 ) that's been limiting on the data side of things. But, saving to disc and doing statistics at the end may need to be considered in the python and R implementations, because I think what is currently happening, especially for multirun, is 100 iterations of data x the output frequency is being held at once in addition to the input data. So for my 10 year simulation, that's 1000 2d arrays held in memory. I could be way off on my understanding of how that is actually working.

ChrisJones687 · 2021-06-03T13:59:07Z

As soon as a merge the PR it should work with.

time_step = "week", output_frequency = "every_n_steps", output_frequency_n = (5 * 52)

nkruskamp · 2021-06-03T14:34:38Z

Thanks @ChrisJones687 !

nkruskamp · 2021-06-03T17:19:36Z

This is my bad for opening what should probably be two issues in one, but the incomplete spatial results are still an issue. I changed 4 parameters at once, seasonality and output_frequency (ex below). I have since removed the output_frequency parameters but still get incomplete results. I will need to investigate where that issue is coming from.

  season_month_start = 2,
  season_month_end = 9,
  # output_frequency = "time_step",
  # output_frequency_n = (5 * 52),

ChrisJones687 · 2021-06-03T17:49:19Z

I have moved the spatial issue to a separate issue #119.

ChrisJones687 mentioned this issue Jun 1, 2021

Better documentation or more flexible option for frequency of outputs ect. ncsu-landscape-dynamics/pops-core#142

Open

ChrisJones687 linked a pull request Jun 3, 2021 that will close this issue

Update core #118

Merged

ChrisJones687 closed this as completed in #118 Jun 3, 2021

ChrisJones687 mentioned this issue Jun 3, 2021

Outputs spatially incomplete under certain scenarios. #119

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

multirun w/ "output_frequency" and "output_frequency_n" produces incomplete results #116

multirun w/ "output_frequency" and "output_frequency_n" produces incomplete results #116

nkruskamp commented May 31, 2021

wenzeslaus commented Jun 1, 2021

nkruskamp commented Jun 1, 2021 •

edited

Loading

wenzeslaus commented Jun 1, 2021

ChrisJones687 commented Jun 1, 2021 •

edited

Loading

nkruskamp commented Jun 2, 2021 •

edited

Loading

nkruskamp commented Jun 2, 2021

ChrisJones687 commented Jun 2, 2021

nkruskamp commented Jun 2, 2021

wenzeslaus commented Jun 2, 2021

wenzeslaus commented Jun 2, 2021

nkruskamp commented Jun 2, 2021

ChrisJones687 commented Jun 3, 2021

nkruskamp commented Jun 3, 2021

nkruskamp commented Jun 3, 2021

ChrisJones687 commented Jun 3, 2021

multirun w/ "output_frequency" and "output_frequency_n" produces incomplete results #116

multirun w/ "output_frequency" and "output_frequency_n" produces incomplete results #116

Comments

nkruskamp commented May 31, 2021

wenzeslaus commented Jun 1, 2021

nkruskamp commented Jun 1, 2021 • edited Loading

wenzeslaus commented Jun 1, 2021

ChrisJones687 commented Jun 1, 2021 • edited Loading

nkruskamp commented Jun 2, 2021 • edited Loading

nkruskamp commented Jun 2, 2021

ChrisJones687 commented Jun 2, 2021

nkruskamp commented Jun 2, 2021

wenzeslaus commented Jun 2, 2021

wenzeslaus commented Jun 2, 2021

nkruskamp commented Jun 2, 2021

ChrisJones687 commented Jun 3, 2021

nkruskamp commented Jun 3, 2021

nkruskamp commented Jun 3, 2021

ChrisJones687 commented Jun 3, 2021

nkruskamp commented Jun 1, 2021 •

edited

Loading

ChrisJones687 commented Jun 1, 2021 •

edited

Loading

nkruskamp commented Jun 2, 2021 •

edited

Loading