
Conversation

tretherington

@tretherington tretherington commented Aug 30, 2018

AUTHOR

Dear @ReScience/editors,

I request a review for the following replication:

Original article

Title: Resampling methods for evaluating classification accuracy of wildlife habitat models
Author(s): Verbyla DL, Litvaitis JA
Journal (or Conference): Environmental Management
Year: 1989
DOI: https://doi.org/10.1007/BF01868317
PDF: https://www.researchgate.net/profile/John_Litvaitis/publication/226300610_Resampling_methods_for_evaluating_class_accuracy_of_wildlife_habitat_models/links/53d4fd790cf2a7fbb2ea2b1d/Resampling-methods-for-evaluating-class-accuracy-of-wildlife-habitat-models.pdf?origin=publication_detail

Replication

Author(s): @tretherington and David Lieske
Repository: https://github.com/tretherington/ReScience-submission/tree/etherington-lieske
PDF: https://github.com/tretherington/ReScience-submission/blob/etherington-lieske/article/etherington-lieske-2018.pdf (Apologies: I'm really struggling to get pandoc cross-ref working, but hopefully this PDF will suffice to start the process)
Keywords: ecology, wildlife, habitat, model
Language: Python
Domain: Ecology

Results

  • Article has been fully replicated
  • Article has been partially replicated
  • Article has not been replicated

Potential reviewers

Based on an 'ecology' and 'Python' combination, @tpoisot and @jsta may be good potential reviewers.


EDITOR

  • Editor acknowledgment - @tpoisot
  • Reviewer 1 - @jkitzes accepted 2018-10-31 🎃
  • Reviewer 2 - @laurajanegraham accepted 2019-01-30
  • Review 1 decision [accept/reject]
  • Review 2 decision [accept/reject]
  • Editor decision [accept/reject]

@rougier
Member

rougier commented Sep 5, 2018

@tretherington Thank you for your submission. An editor will be assigned soon.

@rougier
Member

rougier commented Sep 5, 2018

@tpoisot Can you edit this submission?

@rougier
Member

rougier commented Sep 8, 2018

@tpoisot 👏

@rougier
Member

rougier commented Sep 14, 2018

@tretherington I'm not forgetting you! I think @tpoisot won't be available before 10/10. I'm looking for another editor.

@karthik Could you handle this submission ?

@karthik
Member

karthik commented Sep 14, 2018

Hi @rougier I am on continuous travel through middle of October so I would recommend another editor for this submission.

@rougier
Member

rougier commented Sep 18, 2018

@karthik Ok, thanks.
@dmcglinn could you edit this submission?

@dmcglinn

Hey @rougier, sorry but I cannot help with this submission as an editor. I'm overcommitted this month.

@rougier
Member

rougier commented Oct 6, 2018

@tretherington As you can see above, we've had some problems finding the right editor, but I think @tpoisot will be available in ten days. If it's OK with you we can wait a few more days, or I can send a call for edits to all reviewers. What do you think? Or maybe @khinsen knows someone?

@khinsen
Contributor

khinsen commented Oct 6, 2018

@rougier No, sorry, I don't know anyone in ecology other than the people who are already on our editor/reviewer list!

@dmcglinn

dmcglinn commented Oct 7, 2018

Here are some other potential ecology folks that may be able to help with this @ethanwhite @ahhurlbert @jarioksa @gavinsimpson @dschwilk just to name a few that I know would all be great as reviewers or editors.

@tretherington
Author

Hi @rougier no worries about the delay, all quite understandable. Happy to wait for a bit longer if you think a good option is on the horizon. :)

@rougier
Member

rougier commented Oct 13, 2018

@tpoisot Do you have more time now to edit this submission?

@rougier
Member

rougier commented Oct 19, 2018

@tpoisot 🛎

@rougier
Member

rougier commented Oct 30, 2018

@tpoisot @karthik Could you edit this submission? (We are very late now.)

@tpoisot

tpoisot commented Oct 30, 2018

On it. I'll invite reviewers as soon as I reach my office.

@tpoisot

tpoisot commented Oct 30, 2018

Reviewer invitations

@gvdr
@jkitzes
@emchristensen

Would one of you be available to review this article? I can walk you through the review process if needed.

@gavinsimpson

Sorry @dmcglinn I'm just catching up with various requests through GitHub - I can't review this just now as I'm overcommitted at the moment (& Python is not really my wheelhouse).

@jkitzes

jkitzes commented Oct 31, 2018

@tpoisot I believe I can get to this - when do you need it by?

@tpoisot

tpoisot commented Oct 31, 2018

@jkitzes 3 weeks?

@jkitzes

jkitzes commented Oct 31, 2018

@tpoisot Sure, I can do that. Anything to know about the review process other than what's in the reviewer guidelines?

@tpoisot

tpoisot commented Nov 7, 2018

Ping @gvdr and @emchristensen

@emchristensen

@tpoisot I'm sorry but I'm not going to be able to take this on at the moment (and I really don't know much about Python).

@jkitzes

jkitzes commented Nov 18, 2018

GENERAL COMMENTS

In this manuscript, the authors create an up-to-date Python implementation of an analysis of several different resampling methods, originally published by Verbyla and Litvaitis. They verify the authors' original conclusion, which is that resubstitution produces biased estimates of classifier error, while several other procedures (including cross-validation, jackknifing, and bootstrapping) produce unbiased estimates. This test is done in the context of a species distribution model, but it applies more generally as well.
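The bias being tested can be sketched in a few lines. This is a hypothetical illustration, not code from the submission, assuming scikit-learn's LinearDiscriminantAnalysis and NumPy:

```python
# Hypothetical illustration (not from the submission): compare the
# resubstitution error of an LDA classifier with a cross-validated estimate.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 60                                  # small sample, where the bias is most visible
X = rng.normal(size=(n, 5))             # five simulated predictor variables
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)  # noisy presence/absence

lda = LinearDiscriminantAnalysis()
resub_error = 1 - lda.fit(X, y).score(X, y)             # tested on the training data
cv_error = 1 - cross_val_score(lda, X, y, cv=5).mean()  # tested on held-out folds

print(f"resubstitution error: {resub_error:.3f}")
print(f"cross-validation error: {cv_error:.3f}")
```

Resubstitution typically reports a lower (optimistic) error than the held-out estimate, which is the pattern Verbyla and Litvaitis documented.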

I found the manuscript clear and the results reasonable. I would consider the original study to be successfully replicated, while noting that this new implementation actually goes beyond the original analysis and extends its findings in some respects. The code in this PR actually does not run on my system, but with the two modifications below, it runs successfully and reproduces the key figure.

CODE COMMENTS

While I did not go through the code line-by-line to verify its accuracy, the code is written clearly and understandably overall. The implementation appears to follow what the authors describe in the manuscript. I did encounter two issues that needed to be fixed, however, at least on my system:

First, there appears to be one major bug. The function peLDA never actually creates an LDA object that has the methods fit and predict available. Changing this function to

# Define a function to calculate prediction errors from LDA model
def peLDA(trainPA, trainEV, testPA, testEV):
    # Create and fit model
    LDA_obj = LDA()
    LDA_obj.fit(trainEV, trainPA)
    # Predict against test data
    predicts = LDA_obj.predict(testEV)
    # Calculate and return the prediction errors
    return(np.abs(testPA - predicts))

will correct the problem.
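For anyone testing the fix locally, a minimal exercise of the corrected function on toy data might look like this (the import aliases LDA and np are assumptions matching the snippet above; the data are made up):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

# Corrected function from the review comment above
def peLDA(trainPA, trainEV, testPA, testEV):
    LDA_obj = LDA()
    LDA_obj.fit(trainEV, trainPA)
    predicts = LDA_obj.predict(testEV)
    return np.abs(testPA - predicts)

# Toy data: 3 environmental variables, presence driven by the first one
rng = np.random.default_rng(42)
EV = rng.normal(size=(40, 3))
PA = (EV[:, 0] > 0).astype(int)

# Train on the first 30 sites, test on the last 10
errors = peLDA(PA[:30], EV[:30], PA[30:], EV[30:])
print(errors)  # one value per test site: 0 (correct) or 1 (error)
```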

Second, I did not have LaTeX installed locally on the machine where I tested this code, and the matplotlib figure will not save (throws an error) due to the line plt.rc('text', usetex=True). Changing this to False allows the figure to save - the text is not rendered as nicely, of course, but everything is still in the right place. I would suggest either setting this to False by default or noting that LaTeX is a dependency.

ARTICLE COMMENTS

A few fairly minor comments on the text:

  1. The authors appear to consistently mis-spell the second author's name in the paper they are replicating - it should be "Litvaitis" not "Litvatis".
  2. There's a vertical bar missing from Eq. 3.
  3. I think it would help to link the terminology here back explicitly to that used by Verbyla and Litvaitis. It's fairly obvious for someone used to looking at this, but it couldn't hurt to say things like "Verbyla and Litvaitis use 'Cross-validation' to refer to Hold-out cross-validation performed for one repetition", and so on for the other methods.
  4. While it's logical and useful to test three types of hold-out validation here, I would note that there does not appear to me to be any ambiguity about which case Verbyla and Litvaitis were envisioning - they state under "Cross-validation" that "only one estimate of accuracy is made", which to me indicates that they used H=1. This would also somewhat explain their general statement about the imprecision of the estimate of classification accuracy.
  5. I would suggest reordering the description of approaches in either the "Resampling methods" or "Computational experiment replication" section so that they occur in the same order in both.
  6. I would suggest stating, just for clarity, that the replication here used a LDA for prediction as well.
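Point 4 above is easy to demonstrate: repeating a single hold-out split shows how much an H = 1 accuracy estimate varies. A hypothetical sketch, again assuming scikit-learn:

```python
# Hypothetical sketch of why a single hold-out split (H = 1) gives an
# imprecise accuracy estimate: repeating the split reveals its spread.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 4))
y = (X[:, 0] > 0).astype(int)

estimates = []
for seed in range(50):  # 50 independent 70/30 hold-out splits
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=seed)
    estimates.append(LinearDiscriminantAnalysis().fit(Xtr, ytr).score(Xte, yte))

print(f"single split (H = 1): {estimates[0]:.3f}")
print(f"mean of 50 splits: {np.mean(estimates):.3f} (sd {np.std(estimates):.3f})")
```

Any one split can land anywhere in that spread, which is consistent with Verbyla and Litvaitis's general remark about the imprecision of a single accuracy estimate.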

@tretherington
Author

@tpoisot article correctly typeset now when you have some time to review

@tretherington
Author

Hi @dimpase thanks for providing another test of the code 😃! I've added a comment about needing a dvipng installation in the README.

@laurajanegraham

@tretherington Ah yes, I see that in the README now. That's absolutely fine then; nothing else that I would suggest. Also, not sure why I had such an out of date version of numpy!

@tpoisot I recommend this for acceptance

@tpoisot

tpoisot commented Mar 21, 2019

@rougier following the two positive reviews, I'm happy to recommend acceptance of this article. I'm rusty on the next steps (and @tretherington will need to resolve the issues with the Makefile first anyway).

@rougier
Member

rougier commented Mar 21, 2019

See http://rescience.github.io/edit/ but I copied it below (I can't believe the procedure is that complex, but this will be much simpler soon).

  • Lock the conversation on the original PR

  • Ask the author(s) for keywords if they haven’t provided them already.

  • Import the authors’ repository into the ReScience archives (https://github.com/ReScience-Archives) using the naming convention “Author(s)-YEAR”

  • Add a new remote (named rescience) in your local copy of the repository that points to the newly imported repository (the one on ReScience-Archives)

  • Update article metadata:

    • Editor name
    • Reviewer 1 name
    • Reviewer 2 name
    • Submission date
    • Publication date
    • Article repository
    • Code repository
    • Notebook repository (if necessary)
    • Data repository (if necessary)
    • Volume, issue and year
    • Article number (after reserving the number in the dedicated GitHub issue)
  • If the article name is not Author(s)-YEAR.md, rename it

  • Rebuild the PDF and check everything is OK

  • Merge the rescience branch into master

  • Push these changes onto the rescience remote repository

  • Make a new release:

    Release version number is 1.0
    Release name is Author(s)-YEAR-1.0

  • Download the zip file and rename it to Author(s)-YEAR-1.0.zip

  • Upload this zip file to Zenodo.
    You will have to fill several fields:

    • Name of the journal is ReScience
    • Under “Communities” add “ReScience journal”
    • Under “Contributors” add yourself with role “Editor”.
    • Don’t forget keywords
  • Announce publication in the PR (and quote the DOI, see
    #3 for example)

  • Make a PR to the Web site repository in order to update the page rescience.github.io/read. This requires creating a new post (directory _posts) based on this model because the page is composed from the posts of type article.

  • Make a PR to update Rescience/Volume X - Issue Y.md

  • Close the PR without merging

@tretherington
Author

@tpoisot thanks for finding time for this. All should be good with the Makefile and pdf now, but do let me know if you need anything else from me.

@tretherington
Author

Hi @tpoisot just checking if you are waiting on me for something - I think I've done everything, but please let me know if I've missed something

@tpoisot

tpoisot commented Apr 17, 2019

@tretherington there are still conflicts with the Makefile -- when this is solved, @ReScience/editors will be able to assign an article number.

@tretherington
Author

@tpoisot Aha, yes, the same issue is now corrected in both my Makefile and the master one; sorry, I should have spotted and realised that was an issue. All good now, I think.

@tretherington
Author

Hi @tpoisot, any chance this could be progressed? I appreciate you will be busy, but I have an end of project deadline coming up, and it would be great to have a published version of this for reporting purposes. I will help in whatever way I can!

@tpoisot

tpoisot commented May 14, 2019

I missed the message with the info from @rougier -- I will go through the upload and publication on Friday.

@rougier
Member

rougier commented May 23, 2019

@tretherington @tpoisot I just put the new website online such that the publication process should be easier. I can help on that. What I need at this point is the PDF of the article and a metadata file following this model: https://github.com/ReScience/template/blob/master/metadata.yaml.

@tretherington If you can fill in the metadata and give it back to me, I can probably publish it today. If you want to use the new article template design, it might necessitate a bit more work (from you) but we can also do it.

@tretherington
Author

@rougier I like the look of the new publication process. I think basing things on LaTeX will smooth the process out significantly - I know I spent a lot of time trying to get pandoc working properly, and my paper was first written in LaTeX so I'm back to where I started!

I wasn't sure how best to create/submit the new PDF without (a) breaking the link to the review here, or (b) duplicating the paper by submitting again. So I'm hoping you might be able to figure out the best way to blend my submission under the old system into a paper in the new system. To help you do that I've created a new folder in my pull request that contains the metadata.yaml and content.tex needed to generate the PDF which I have also done (new paper layout/style looks good!).

I think the only things outstanding are some urls and dois for the metadata, but as I've said already, I was hoping you might be able to fill those in as you will understand better how to mesh the two systems.

Hope that is all OK, and please just let me know if you need anything else.

@rougier
Member

rougier commented May 24, 2019

Perfect! I will get the DOI from Zenodo and rebuild your PDF. The last thing I need is a code DOI (in case your repo disappears at some point in the future). Can you deposit it on Zenodo and send me back the DOI?

@tretherington
Author

@rougier many thanks!

I have deposited the repository on Zenodo with DOI: 10.5281/zenodo.3229408

@rougier
Member

rougier commented May 27, 2019

Can you fill in the editor and reviewer names/ORCIDs? Volume is 5, issue is 1 (and can you also fill in the code URL and DOI?).

  • submission date + publication date (28/05/2019)

@tretherington
Author

Hi @rougier I think that is all the metadata except the article number, DOI, and URL.

@rougier
Member

rougier commented May 28, 2019

Can you check if https://sandbox.zenodo.org/record/294119 seems correct?
(This is not the real DOI because this is Zenodo sandbox)

@tretherington
Author

@rougier the article pdf looks correct to me!

@rougier
Member

rougier commented May 29, 2019

Done! https://rescience.github.io/read/#volume-5-2019

@tretherington
Author

Thanks for all your help @rougier ! Having now had some of my work replicated and having replicated someone else's work in ReScience, I've become a big fan of the journal.

@rougier rougier closed this May 30, 2019
@ReScience ReScience locked as resolved and limited conversation to collaborators May 30, 2019