Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

report autocorrelation for recruitment devs and selectivity deviations #65

Closed
k-doering-NOAA opened this issue Nov 5, 2020 · 13 comments
Assignees
Labels
wishlist request new feature; bigger than revision; OK to remove after adding to Milestone

Comments

@k-doering-NOAA
Copy link
Contributor

Imported from redmine, Issue #73992
Opened by @iantaylor-NOAA on 2020-01-29
Status when imported: Resolved

As discussed at 29 January 2020 meeting, it would be useful to report autocorrelation for recruitment deviations and semi-parametric (2D-AR) selectivity deviations.

@k-doering-NOAA k-doering-NOAA added the wishlist request new feature; bigger than revision; OK to remove after adding to Milestone label Nov 5, 2020
@k-doering-NOAA
Copy link
Contributor Author

comment from @k-doering-NOAA on 2020-06-25:
@kelli.johnson, do you still have the code Ian references in this issue? Would it be useful to attach it? It seems like this issue could be worked without it, though, unless there are additional quantities that SS needs to report...

@k-doering-NOAA
Copy link
Contributor Author

comment from neal.schindler on 2020-07-21:
Kelli, I'm in the dark on this, would you please attach the code?
Otherwise, I need someone to explain what this even is. (I might need that
anyway!)

Thanks

On Thu, Jun 25, 2020 at 2:31 PM vlab.redmine@noaa.gov wrote:

@k-doering-NOAA
Copy link
Contributor Author

comment from @RickMethot on 2020-07-23:
I was thinking of adding parm_dev autocorr to this section of code that is already calculating the mean and root mean squared error (rmse) of each parm_dev vector:
if(pick_report_use(4)=="Y")
{
SS2out<<endl<<pick_report_name(4)<<endl;
SS2out<<"Index Phase MinYr MaxYr stddev Rho Like_devs Like_se mean rmse"<<endl;
for(i=1;i<=N_parm_dev;i++)
{
SS2out<<i<<" "<<parm_dev_PH(i)<<" "<<parm_dev_minyr(i)<<" "<<parm_dev_maxyr(i)<<" "<<parm_dev_stddev(i)<<" "<<
parm_dev_rho(i)<<" "<<parm_dev_like(i)<<" "<<
sum(parm_dev(i))/float(parm_dev_maxyr(i)-parm_dev_minyr(i)+1.)<<" "<<
sqrt(sumsq(parm_dev(i)+1.0e-9)/float(parm_dev_maxyr(i)-parm_dev_minyr(i)+1.))<<endl;
}

the calculation of autocorrelation with a lag of 1 year would modify:
sqrt(sumsq(parm_dev(i)+1.0e-9)/float(parm_dev_maxyr(i)-parm_dev_minyr(i)+1.))<<endl;

to something like
step 1: create a temp_vector that is parm_dev(i,j) lagged by 1 time step. so temp(j)=parm(i,j+1) for j=parm_dev_minyr(i) to parm_dev_maxyr(i)-1.
step 2: calculate sum of the lagged deviations as elem_prod( parm_dev(i)(parm_dev_minyr(i),parm_dev_maxyr(i)-1.) , temp(same range of index) ).

@k-doering-NOAA
Copy link
Contributor Author

comment from @k-doering-NOAA on 2020-07-30:
I have edited the description for this issue to take Haikun's code out of the mix - getting Haikun's code into r4ss is something Kelli will work on and it isn't relevant to the issue at hand. All that needs to be done for this isssue (to my understanding) is to "report autocorrelation for recruitment deviations and semi-parametric (2D-AR) selectivity deviations". link to ian's model.

the calculation of autocorrelation with a lag of 1 year would modify:
sqrt(sumsq(parm_dev(i)+1.0e-9)/float(parm_dev_maxyr(i)-parm_dev_minyr(i)+1.))<<endl;

to something like
step 1: create a temp_vector that is parm_dev(i,j) lagged by 1 time step. so temp(j)=parm(i,j+1) for j=parm_dev_minyr(i) to parm_dev_maxyr(i)-1.
step 2: calculate sum of the lagged deviations as elem_prod( parm_dev(i)(parm_dev_minyr(i),parm_dev_maxyr(i)-1.) , temp(same range of index) ).

@richard.methot, the question I'm unclear on is do you just want SS to report autocorrelation values with only a lag of 1? Or should something more extensive, like the "acf function in R":https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/acf be done (acf calculates the autocorrelation at any lag up to 1 less than the number of observations)? It sounds like in your description above, just reporting autocorrelation with lag 1 should be sufficient, but I wanted to check.

In case it is helpful, "this tutorial":https://www.datacamp.com/community/tutorials/autocorrelation-r talks about the acf function in R, but also about why you would want to calculate autocorrelation.

Also, where should the autocorrelation values for rec devs and (if they exist) for AR selectivity devs be reported? I assume somewhere in the Report file?

@k-doering-NOAA
Copy link
Contributor Author

comment from neal.schindler on 2020-09-15:
I have incorporated the following code into the ss.exe attached to this comment. Please test it.  
for(i=1;i<=N_parm_dev;i++)
    {
      dvar_vector temp_autocorr_vec(parm_dev_minyr(i),parm_dev_maxyr(i)-1.);
      dvar_vector temp_autocorr(parm_dev_minyr(i),parm_dev_maxyr(i)-1.);
      temp_autocorr_vec.initialize();
      temp_autocorr.initialize();
      for (int j=parm_dev_minyr(i);j<parm_dev_maxyr(i);j++)
        temp_autocorr_vec(j)=parm_dev(i,j+1);
      temp_autocorr=elem_prod(parm_dev(i)(parm_dev_minyr(i),parm_dev_maxyr(i)-1.), temp_autocorr_vec(parm_dev_minyr(i),parm_dev_maxyr(i)-1.));
      SS2out<<i<<" "<<parm_dev_PH(i)<<" "<<parm_dev_minyr(i)<<" "<<parm_dev_maxyr(i)<<" "<<parm_dev_stddev(i)<<" "<<
      parm_dev_rho(i)<<" "<<parm_dev_like(i)<<" "<<
      sum(parm_dev(i))/float(parm_dev_maxyr(i)-parm_dev_minyr(i)+1.)<<" "<<
      temp_autocorr<<endl;
    }

@k-doering-NOAA
Copy link
Contributor Author

comment from neal.schindler on 2020-09-17:
The output for the attached model is this:
Parm_devs_detail report:4
Index Phase MinYr MaxYr stddev Rho Like_devs Like_se mean rmse
1 4 1972 2021 13 0 0.110312 0 0.0465018 0.00730064 0.0025674 0.000143538 -0.000572001 -0.000810501 0.00150115 0.00212718 0.000123532 9.49443e-005 0.00132976 0.00969544 -0.00467016 -0.00128357 0.000985198 0.00358334 0.00877272 0.00790381 0.00262727 0.00388036 0.0083304 0.00430404 0.00539481 -0.0045459 -0.00254027 0.00596679 0.00617174 0.00293049 0.00720249 0.00701234 0.00323496 0.000960713 0.000803009 0.00181027 0.00202123 0.00162128 0.00143333 0.00548851 -0.00422762 -0.000964375 0.00127806 0.00086093 0.000282454 0.000253002 0.000524551 0.000347076 1.2438e-005 0.000122499 0.00493683 0.00203228
2 4 1972 2021 13 0 0.0291132 0 0.00423782 0.000185037 2.72776e-006 3.64698e-007 -0.000120687 0.000220065 -4.04731e-005 -0.000167846 0.000116622 3.85119e-005 0.000110056 -0.000224464 -0.00109312 0.00194002 0.00169865 0.000718416 -3.3007e-005 2.30352e-005 -7.49298e-006 1.13195e-005 0.000211233 -3.02761e-005 -1.54668e-005 0.0014855 -0.000500485 8.32937e-005 0.000228949 0.000242568 7.1675e-005 7.7379e-005 0.000537395 0.000735039 0.000968719 0.000559809 0.000307525 0.000701608 0.000632843 -0.000220502 0.000487867 -0.0011195 0.000591394 0.000280009 -0.000182833 0.000134046 -1.00675e-005 -5.9119e-006 5.88778e-005 -0.000152201 0.000208812 0.000297337

@k-doering-NOAA
Copy link
Contributor Author

comment from @RickMethot on 2020-09-17:
I have revised the code to this:
int y1=parm_dev_minyr(i);
int y2=parm_dev_maxyr(i);
double count;
count=float(y2-y1+1.);
double mean;
mean=value(sum(parm_dev(i))/count);
for (j=y1+1;j<=y2;j++) {for_AR1(j) = value(parm_dev(i,j-1))-mean;}
for_var = value(parm_dev(i))-mean;
SS2out<<"orig: "<<parm_dev(i)<<endl<<"-mean: "<<for_var<<endl;
double cross;
double Durbin;
double var;
var=sumsq(for_var);
cross=0.;
Durbin=0;
for(j=y1+1;j<=y2;j++)
{
cross+=for_var(j)*for_AR1(j);
Durbin+=square(for_var(j)-for_AR1(j));
}
cross/=(count-1.);
Durbin/=(var+1.0e-09);
var/=count;
SS2out<<i<<" "<<parm_dev_PH(i)<<" "<<y1<<" "<<y2<<" "<<count<<" "<<parm_dev_stddev(i)<<" "<<
parm_dev_rho(i)<<" "<<parm_dev_like(i)<<" "<<
sum(parm_dev(i))/count<<" "<<sqrt(sumsq(parm_dev(i))/(count))<<" "<<var<<" "<<sqrt(var)<<" "<<cross/var<<" "<<Durbin<<" "<<endl;

And it produces this:
Index Phase MinYr MaxYr N stddev Rho Like_devs Like_se mean rmse var sqrt(var) est_rho D-W
1 4 1972 2021 50 13 0 0.110312 0 0.0465018 0.0664264 0.00225005 0.0474347 0.0221286 1.95646
orig: 0.0499888 0.146046 0.0175794 0.00816514 -0.0700541 0.0115696 0.129749 0.0163945 0.00753493 0.0126006 0.105532 0.0918722 -0.0508332 0.0252506 0.0390168 0.091841 0.0955207 0.0827445 0.0317516 0.12221 0.0681647 0.0631418 0.0854395 -0.0532061 0.0477439 0.124975 0.0493838 0.0593412 0.121374 0.0577745 0.0559929 0.0171578 0.0468015 0.0386797 0.0522556 0.031026 0.0461976 0.118805 -0.0355845 0.027101 0.047159 0.0182559 0.0154719 0.0163523 0.0320781 0.0108197 0.00114956 0.106562 0.0463284 0.0438668
-mean: 0.003487 0.0995438 -0.0289223 -0.0383366 -0.116556 -0.0349321 0.0832474 -0.0301072 -0.0389668 -0.0339012 0.05903 0.0453705 -0.097335 -0.0212511 -0.007485 0.0453392 0.049019 0.0362427 -0.0147502 0.0757082 0.0216629 0.0166401 0.0389378 -0.0997079 0.00124213 0.0784732 0.00288204 0.0128394 0.0748724 0.0112728 0.00949114 -0.029344 0.000299763 -0.00782208 0.00575388 -0.0154758 -0.000304169 0.0723033 -0.0820862 -0.0194007 0.000657266 -0.0282459 -0.0310298 -0.0301495 -0.0144236 -0.035682 -0.0453522 0.0600598 -0.000173357 -0.00263498
Can you put the "-mean" vector into R and check on my calculation of var, est_rho and if R has it, the Durbin-Watson statistic?

@k-doering-NOAA
Copy link
Contributor Author

comment from @RickMethot on 2020-09-18:
I mean: check against standard R functions like acf

@k-doering-NOAA
Copy link
Contributor Author

comment from @k-doering-NOAA on 2020-09-18:
@richard.methot, calcs of est_rho and the Durbin-Watson statistic look right. Here's what I did in R to check:

> mean_vec <- "0.003487 0.0995438 -0.0289223 -0.0383366 -0.116556 -0.0349321 0.0832474 -0.0301072 -0.0389668 -0.0339012 0.05903 0.0453705 -0.097335 -0.0212511 -0.007485 0.0453392 0.049019 0.0362427 -0.0147502 0.0757082 0.0216629 0.0166401 0.0389378 -0.0997079 0.00124213 0.0784732 0.00288204 0.0128394 0.0748724 0.0112728 0.00949114 -0.029344 0.000299763 -0.00782208 0.00575388 -0.0154758 -0.000304169 0.0723033 -0.0820862 -0.0194007 0.000657266 -0.0282459 -0.0310298 -0.0301495 -0.0144236 -0.035682 -0.0453522 0.0600598 -0.000173357 -0.00263498"
> 
> mean_vec <- as.numeric(strsplit(mean_vec, split = " ")[[1]])
> 
> mean_vec
 [1]  0.003487000  0.099543800 -0.028922300 -0.038336600
 [5] -0.116556000 -0.034932100  0.083247400 -0.030107200
 [9] -0.038966800 -0.033901200  0.059030000  0.045370500
[13] -0.097335000 -0.021251100 -0.007485000  0.045339200
[17]  0.049019000  0.036242700 -0.014750200  0.075708200
[21]  0.021662900  0.016640100  0.038937800 -0.099707900
[25]  0.001242130  0.078473200  0.002882040  0.012839400
[29]  0.074872400  0.011272800  0.009491140 -0.029344000
[33]  0.000299763 -0.007822080  0.005753880 -0.015475800
[37] -0.000304169  0.072303300 -0.082086200 -0.019400700
[41]  0.000657266 -0.028245900 -0.031029800 -0.030149500
[45] -0.014423600 -0.035682000 -0.045352200  0.060059800
[49] -0.000173357 -0.002634980
> 
> acf(mean_vec, lag.max = 1, plot = FALSE) #get est_rho by looking at lag 1

Autocorrelations of series ‘mean_vec’, by lag

    0     1 
1.000 0.022 
> 
> car::durbinWatsonTest(model = mean_vec) # get D-W statistic
[1] 1.956458

Let me know if there is anything else you wanted me to check!

@k-doering-NOAA
Copy link
Contributor Author

comment from @RickMethot on 2020-09-21:
perfect. Thank you

@k-doering-NOAA
Copy link
Contributor Author

comment from @RickMethot on 2020-09-21:
In Spawn-Recruit, output is:
ERA N RMSE RMSE^2/sigmaR^2 mean_BiasAdj est_rho Durbin-Watson
main 51 0.499639 0.693441 1 0.117993 1.75467
early 10 0.362451 0.364919 1

In parm_dev_details:
Parm_devs_detail report:4
Index Phase MinYr MaxYr N stddev Rho Like_devs Like_se mean rmse var sqrt(var) est_rho D-W
1 4 1972 2021 50 13 0 0.110312 0 0.0465018 0.0664264 0.00225005 0.0474347 0.0221286 1.95646
2 4 1972 2021 50 13 0 0.0291132 0 0.00423782 0.0341252 0.00114657 0.033861 0.159893 1.67088

@Rick-Methot-NOAA Rick-Methot-NOAA added needs testing issue has been resolved and needs testing resolved issue resolved, look for "needs test" label labels Jan 27, 2021
@Rick-Methot-NOAA
Copy link
Collaborator

I think this is resolved. Chantel, please take a look. @chantelwetzel-noaa

@k-doering-NOAA
Copy link
Contributor Author

I also think this is resolved, but maybe just needs documentation (opened nmfs-ost/ss3-doc#17 just in case)

closing, but feel free to reopen if it seems it needs more than just documentation.

@k-doering-NOAA k-doering-NOAA removed the needs testing issue has been resolved and needs testing label May 19, 2021
@k-doering-NOAA k-doering-NOAA removed the resolved issue resolved, look for "needs test" label label Jun 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
wishlist request new feature; bigger than revision; OK to remove after adding to Milestone
Projects
None yet
Development

No branches or pull requests

2 participants