Sensitivity wishlist #43

Closed
6 tasks
iantaylor-NOAA opened this issue May 24, 2021 · 30 comments
Labels
sensitivity w/ average priority (ideas that could be sensitivity analyses); sensitivity w/ low priority (ideas that are on the wishlist but are lower priority); topic: model (Pertains to an SS model run)

Comments

@iantaylor-NOAA

iantaylor-NOAA commented May 24, 2021

I'm not sure if it makes sense to create separate issues to track high and low priority sensitivity ideas or pile them all in one.
Here's a low-priority one from #27 (comment).

  • test the impact on the 2019 models of changing the month value for all the survey observations to 7 to test Ian's belief that the extra complexity of not assigning everything to the middle of the year will have only a tiny impact on the model results and not be worth the extra complexity.
  • Recreational CAAL (DATA 2 Recreational Data #20)
  • Adjusting recreational comps by sex. Previous model used sexed (WA), unsexed (OR and CA) comps. Currently, we have both sexed and unsexed comps for all rec fleets
  • If an Oregon recreational onboard (CPFV ride along) index becomes available before the assessment is complete (@aliwhitman is focused on ORBS and nearshore logbook first), it could be useful to test the sensitivity to including it in the model either instead of the ORBS index or via a separate fleet with mirrored selectivity to the OR rec fleet. The 2017 assessment report noted "Note that the base assessment model does not use both the OR onboard index as well as the OR dockside as they show similar trends. The dockside index is used due to the longer time series."
  • Explore the option for length and age compositions of "combM+F: males and females treated as combined gender below this bin number" for survey data, and possibly other data sources. Sex ratios are highly variable at small sizes, and this could potentially smooth out the uncertainty in determining sex for those small sizes.
  • Change California Recreational Index in recent years from onboardCPFV to CRFSPR (choose what CA recreational CPUE to use #62)
@iantaylor-NOAA added the labels "topic: model", "sensitivity w/ average priority", and "sensitivity w/ low priority" May 24, 2021
@kellijohnson-NOAA

I think a combination of both is fine, where sensitivities without a larger current discussion can go here, but we can also tag other issues with one of the appropriate "sensitivity" tags.

@kellijohnson-NOAA

TOR says we must estimate a single M as a sensitivity.

@iantaylor-NOAA

Kicking off a bunch of sensitivities to run while we add more text to the document seems like a good idea.

Here's a proposal for a cleaned-up set of sensitivities which we can automate within run_sensitivities().
This set is based on looking through the sensitivities listed at the top of this issue as well as the various other issues tagged with a sensitivity label. Please feel free to add or subtract from the list.

I think grouping them by number (e.g. 101, 102, etc. for biology sensitivities) will make it easy to
group subsets of them for inclusion in figures and tables. I've started editing run_sensitivities() but it will be easy
to refine the set at any time.

biology and recruitment (100-series)

  • estimate a single M for both sexes with female prior
  • estimate a single M for both sexes with male prior
  • fix h at 0.7
  • fix female M at 0.3 and h at 0.7
  • increase sigmaR from 0.6 to 0.8
  • decrease sigmaR from 0.6 to 0.4

composition data (likely requires extra tuning step) (200-series)

  • add all additional ages as CAAL to the south model
  • add all additional ages as marginal to the south model
  • add all fishery-independent ages as CAAL in the south model
  • remove fishery-dependent ages from the north model
  • apply D-M tuning
  • use the combM+F option to ignore sex ratios among small fish
  • remove all unsexed fish

indices (300-series)

  • Change CA Rec index in recent years from onboardCPFV to CRFSPR
  • Change OR Rec index to onboard (CPFV ride along) index
  • remove each index, 1 at a time
  • remove all fishery-dependent indices

selectivity (400-series)

  • fix the commercial FG selectivity to be asymptotic
  • add male offsets to selectivity (initially just Male_Scale within offset option 3)
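The series numbering proposed above could be used to pull out subsets for figures and tables along these lines. This is only a sketch in base R: it assumes the bullets are numbered in order within each series (the real numbering may differ), and `series_of()` is a hypothetical helper, not part of run_sensitivities().

```r
# Hypothetical sensitivity numbers: bullets numbered in order within each
# series (an assumption; the real numbering may differ).
sens_numbers <- c(101:106, 201:207, 301:304, 401:402)

# Hypothetical helper: map a sensitivity number to its hundreds series.
series_of <- function(number) {
  floor(number / 100) * 100
}

# Group the numbers by series so a figure or table can use one subset.
sens_groups <- split(sens_numbers, series_of(sens_numbers))

# Numbers in the indices (300) series
sens_groups[["300"]]
```

The same grouping would work for any later additions, as long as new sensitivities keep to the hundreds convention.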

@kellijohnson-NOAA

@iantaylor-NOAA where are you at with this? I'm asking because I have a lot of code in a hake script to do many of these.

iantaylor-NOAA added a commit that referenced this issue Jun 24, 2021
doesn't yet work for lambda changes or multiple parameter lines
@iantaylor-NOAA

iantaylor-NOAA commented Jun 24, 2021

run_sensitivities() has been expanded in 68fe18e and ad1f957, so now it's creating all the folders and modifying the parameter lines for those sensitivities where a single parameter line needs to change.

> run_sensitivities(get_dir_ling("n", 17), type = "sens", numbers = c(102, 105:999))
creating models/2021.n.017.102_h0.7
creating models/2021.n.017.105_sigmaR0.4
creating models/2021.n.017.201_all_CAAL_ages
creating models/2021.n.017.202_all_marg_ages
creating models/2021.n.017.203_some_CAAL_ages
creating models/2021.n.017.204_no_fishery_ages
creating models/2021.n.017.205_DM
creating models/2021.n.017.206_combMF
creating models/2021.n.017.207_no_unsexed
creating models/2021.n.017.301_CA_CRFSPR_index
creating models/2021.n.017.302_OR_CPFV_index
creating models/2021.n.017.303_no_fishery_indices
creating models/2021.n.017.311_no_Comm_Trawl_index
creating models/2021.n.017.312_no_Comm_Fix_index
creating models/2021.n.017.313_no_Rec_WA_index
creating models/2021.n.017.314_no_Rec_OR_index
creating models/2021.n.017.315_no_Rec_CA_index
creating models/2021.n.017.316_no_Surv_TRI_index
creating models/2021.n.017.317_no_Surv_WCGBTS_index
creating models/2021.n.017.318_no_Surv_HookLine_index
creating models/2021.n.017.320_no_CPFV_DebWV_index
creating models/2021.n.017.401_asymptotic_FG
creating models/2021.n.017.402_male_sel_offset

Next steps would be

  • getting it to work for multiple parameter lines in one sensitivity (@kellijohnson-NOAA) [not worth the effort this year]
  • getting it to work for lambda changes to leave-one-out sensitivities (I'm happy to do that, shouldn't be hard)
  • getting it to work for changes to age comps (really just swapping data files, shouldn't be too hard) [faster to do by hand]
  • using r4ss::run_SS_models()

But is it worth investing the time now, if it takes a model 30 minutes to run and only 30 seconds to manually edit the control file? Maybe what we've got now is good enough.

@iantaylor-NOAA

In the interest of time, it's probably better to just run all the sensitivities with -nohess and then return to get the Hessian (potentially starting from the .par file) if there's particular interest in any of them. We can still use maximum gradient component as a diagnostic of convergence even without the check of a positive definite Hessian.
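A check along these lines could flag -nohess runs whose maximum gradient component looks too large. This is a sketch only: the function name `flag_large_gradients()`, the 1e-4 threshold, and the example values are all made up for illustration, and the gradients would first need to be read from each model's output.

```r
# Hypothetical helper: return the names of models whose maximum gradient
# component exceeds a threshold (the 1e-4 default is an assumption).
flag_large_gradients <- function(max_gradients, threshold = 1e-4) {
  names(max_gradients)[abs(max_gradients) > threshold]
}

# Made-up values for illustration only, not results from any model run.
example_gradients <- c(model_a = 2.1e-6, model_b = 3.4e-3, model_c = 8.0e-5)
flagged <- flag_large_gradients(example_gradients)
```

Any model flagged this way would be a candidate for a rerun with the Hessian.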

@kellijohnson-NOAA

creating models/2021.n.017.102_h0.7 is already done in the profile on h

@iantaylor-NOAA

Good point @kellijohnson-NOAA. I think a few of the age inclusion/exclusions are already in place as well.

North model takes about 20 minutes with -nohess -cbs 1500000000 and we have 26 sensitivities on the list, so by my math, that's < 9 hours to run them all and some have been run already. The south model will be faster and I can run at least 3 groups at once, so should have all results by the end of today. I think that makes more sense than trying to divide up among different people. Let me know if this doesn't make sense.
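The run-time estimate can be checked with a few lines of R. The serial figure uses only the numbers quoted above; the concurrent figure assumes exactly 3 runs at a time with equal run times, which is a simplification for illustration.

```r
# Numbers quoted in the comment above
n_sens <- 26           # sensitivities on the list
minutes_per_run <- 20  # approximate north model run time with -nohess

# Running everything one after another
serial_hours <- n_sens * minutes_per_run / 60

# Assumption: exactly 3 runs at once, each taking the same time
n_concurrent <- 3
concurrent_hours <- ceiling(n_sens / n_concurrent) * minutes_per_run / 60

round(c(serial = serial_hours, concurrent = concurrent_hours), 2)
```

Under these assumptions the serial total is a little under 9 hours, consistent with the estimate in the comment.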

I can work on creating the summary figures and tables for groups of them after they finish and don't need to wait for the full set.
In the past I've created csv files for tables of sensitivity results like this https://github.com/iantaylor-NOAA/BigSkate_Doc/blob/master/txt_files/Sensitivities_bio.csv. Can someone point me to an example of how I should use kable instead?

@kellijohnson-NOAA

Do you still need me to parse the ones with code in the column?
The csv is great, we will just use kableExtra::kbl() on the csv file. I can create a function table_sens that does it.

@iantaylor-NOAA

Sounds good. Don't worry about parsing the extra column. For this year, I'll just take the 30 seconds to edit the files by hand.

@kellijohnson-NOAA

Too late, I was already down the rabbit hole ...

mapply(function(x) eval(parse(text=gsub('\"', "", x))), sens_table[["INIT"]],SIMPLIFY = FALSE)

iantaylor-NOAA added a commit that referenced this issue Jun 25, 2021
@iantaylor-NOAA

Took too long, but the automated sensitivity code is finally running for the north and south models. I forgot to skip the redundant h0.7 case, but can kill it when it gets there and continue with the rest using type = "sens_run" only.

run_sensitivities(get_dir_ling("s", 14), type = c("sens_create", "sens_run"), numbers = 101:105)
creating models/2021.s.014.101_shareM
creating models/2021.s.014.102_h0.7
creating models/2021.s.014.103_M0.3_h0.7
creating models/2021.s.014.104_sigmaR0.8
creating models/2021.s.014.105_sigmaR0.4
writing models/2021.s.014.001_esth/sensitivities_25-06-2021_11.42.00.1034.csv
running model in models/2021.s.014.101_shareM
Assuming model is in each dirvec folder.
changing working directory to models/2021.s.014.101_shareM
Running model in directory: c:/ss/lingcod/lingcod_2021/models/2021.s.014.101_shareM
Using the command: ss.exe -nox -nohess -cbs 1500000000

@kellijohnson-NOAA

It didn't take too long, you were faster than all of us who were doing nothing on it. Our future selves will ❤️ you for this. 👏

@iantaylor-NOAA

Yes, for sure there is investment in our future selves. Lots of cleanup of the messy things, but easier to clean up than start from scratch.

Now running leave-one-out index sensitivities for the south model (#86).
This has been on the r4ss wishlist since 2015: r4ss/r4ss#59.

> run_sensitivities(get_dir_ling("s", 14), type = c("sens_run", "sens_create"), numbers = c(303:320))
requested sensitivities filtered for area = 's':303, 311, 315, 316, 317, 318, 320
creating models/2021.s.014.303_no_fishery_indices
creating models/2021.s.014.311_no_Comm_Trawl_index
creating models/2021.s.014.315_no_Rec_CA_index
creating models/2021.s.014.316_no_Surv_TRI_index
creating models/2021.s.014.317_no_Surv_WCGBTS_index
creating models/2021.s.014.318_no_Surv_HookLine_index
creating models/2021.s.014.320_no_CPFV_DebWV_index
writing models/2021.s.014.001_esth/sensitivities_25-06-2021_12.26.28.4970.csv
running model in models/2021.s.014.303_no_fishery_indices
Assuming model is in each dirvec folder.
changing working directory to models/2021.s.014.303_no_fishery_indices
Running model in directory: c:/ss/lingcod/lingcod_2021/models/2021.s.014.303_no_fishery_indices
Using the command: ss.exe -nox -nohess -cbs 1500000000

@kellijohnson-NOAA

#dedication

@iantaylor-NOAA

Sensitivity-related functions are added in 2599f1f and e4ed549.
Multiple related functions are in the file R/sensitivity_output.R.

The first table of results is here:
https://github.com/iantaylor-NOAA/Lingcod_2021/blob/main/doc/sens_table_s_bio_rec.csv

It would be easy to change the units, labels, add or subtract rows, etc. within the function so if any of that seems useful, let me know so it applies to future tables, although we can obviously apply additional processing outside the new functions.

@kellijohnson-NOAA

can you save the table in tables rather than doc?

@iantaylor-NOAA

for sure, I didn't know that was a thing--much better, done in 9d04b2f and 0a06d02 (took two commits because I couldn't keep track long enough to move the table AND change the code for future tables on the first try)

@iantaylor-NOAA

I just added a table of south index sensitivities. I think the likelihood may have gone up rather than down for the removal of the rec indices because I forgot to turn off the extraSD parameter. I'll look closer later. Also, the final table would theoretically include 2 more columns for index sensitivities on the list that I haven't run yet (301 and 302).

@kellijohnson-NOAA

Comments on the resulting table would be helpful. I know the digits need to be rounded, and I will fix that. Lower log likelihoods are in red. I don't know if we need that, but it was easy to add so I did it.
image

@iantaylor-NOAA

Looks perfect to me (except for rounding).
I like the red.

The M Male value under the share M should be the same as the M female = 0.278481. Even though we don't use parameter offsets, I just learned from Rick that fixing a non-offset male growth parameter = 0 results in a match with the female value, and confirmed this in the Report file for that model. I could fix that in the table-making code or you could do it at the document end--whichever you wish.

@kellijohnson-NOAA

kellijohnson-NOAA commented Jun 26, 2021

I think it would be better if we fix numbers in the csv code and just fix formatting in the kable code

@iantaylor-NOAA

will do, should I also round everything to 2 digits?

@kellijohnson-NOAA

I already figured out the rounding thing in kable.

@iantaylor-NOAA

The two sensitivity tables are updated for the "share M" case and new plots associated with these sensitivities have been added in 37fc8da. Plots and associated csv files are in models/2021.s.014.001_esth/custom_plots rather than figures because they seem like a useful record to keep with the associated model around which the sensitivities were conducted, rather than being replaced whenever we change a base model.

Plots are below, showing influence of the indices that cover the earlier time period for the south model.
sens_timeseries_s_index
sens_timeseries_s_bio_rec

@iantaylor-NOAA

Third (and last for this draft) set of south model sensitivities added in 856ff19.
Results (fig below) serve as a reminder of why we aren't using DM likelihood or fishery ages in the south model. If I have time tomorrow I will run `r4ss::SS_tune_comps()` on all but the DM model and replace the fig.
sens_timeseries_s_comp

@kellijohnson-NOAA

Owen just commented that the sensitivity with M=.3 and h =.7 doesn't have M at .3 ... @iantaylor-NOAA can you check this in the files?
image

@iantaylor-NOAA

Fix for M=.3 and h =.7 sensitivity added in d02c0d2. Models will run while we sleep (including others for the north) and I'll post fixed results in the morning.

@iantaylor-NOAA

Sensitivity results are updated to fix an issue with the DM and add the remaining ones for the north (low-tech numbering scheme now puts it at the end of the list).
We can run additional sensitivities before the review, but this is all I can manage for now.

@iantaylor-NOAA

Sensitivities are complete.
