
Conversation

tretherington

@tretherington tretherington commented Aug 30, 2018

AUTHOR

Dear @ReScience/editors,

I request a review for the following replication:

Original article

Title: Resampling methods for evaluating classification accuracy of wildlife habitat models
Author(s): Verbyla DL, Litvaitis JA
Journal (or Conference): Environmental Management
Year: 1989
DOI: https://doi.org/10.1007/BF01868317
PDF: https://www.researchgate.net/profile/John_Litvaitis/publication/226300610_Resampling_methods_for_evaluating_class_accuracy_of_wildlife_habitat_models/links/53d4fd790cf2a7fbb2ea2b1d/Resampling-methods-for-evaluating-class-accuracy-of-wildlife-habitat-models.pdf?origin=publication_detail

Replication

Author(s): @tretherington and David Lieske
Repository: https://github.com/tretherington/ReScience-submission/tree/etherington-lieske
PDF: https://github.com/tretherington/ReScience-submission/blob/etherington-lieske/article/etherington-lieske-2018.pdf (Apologies: I'm really struggling to get pandoc cross-ref working, but hopefully this PDF will suffice to start the process)
Keywords: ecology, wildlife, habitat, model
Language: Python
Domain: Ecology

Results

  • Article has been fully replicated
  • Article has been partially replicated
  • Article has not been replicated

Potential reviewers

Based on an 'ecology' and 'Python' combination, @tpoisot and @jsta may be good potential reviewers.


EDITOR

  • Editor acknowledgment - @tpoisot
  • Reviewer 1 - @jkitzes accepted 2018-10-31 🎃
  • Reviewer 2 - @laurajanegraham accepted 2019-01-30
  • Review 1 decision [accept/reject]
  • Review 2 decision [accept/reject]
  • Editor decision [accept/reject]

@rougier
Member

rougier commented Sep 5, 2018

@tretherington Thank you for your submission. An editor will be assigned soon.

@rougier
Member

rougier commented Sep 5, 2018

@tpoisot Can you edit this submission?

@rougier
Member

rougier commented Sep 8, 2018

@tpoisot 👏

@rougier
Member

rougier commented Sep 14, 2018

@tretherington I'm not forgetting you! I think @tpoisot won't be available before 10/10. I'm looking for another editor.

@karthik Could you handle this submission ?

@karthik
Member

karthik commented Sep 14, 2018

Hi @rougier I am on continuous travel through middle of October so I would recommend another editor for this submission.

@rougier
Member

rougier commented Sep 18, 2018

@karthik Ok, thanks.
@dmcglinn could you edit this submission?

@dmcglinn

Hey @rougier, sorry but I cannot help with this submission as an editor. I'm overcommitted this month.

@rougier
Member

rougier commented Oct 6, 2018

@tretherington As you can see above, we've had some problems finding the right editor, but I think @tpoisot will be available in ten days. If it's OK with you we can wait a few more days, or I can send a call for edits to all reviewers. What do you think? Or maybe @khinsen knows someone?

@khinsen
Contributor

khinsen commented Oct 6, 2018

@rougier No, sorry, I don't know anyone in ecology other than the people who are already on our editor/reviewer list!

@dmcglinn

dmcglinn commented Oct 7, 2018

Here are some other potential ecology folks that may be able to help with this @ethanwhite @ahhurlbert @jarioksa @gavinsimpson @dschwilk just to name a few that I know would all be great as reviewers or editors.

@tretherington
Author

Hi @rougier no worries about the delay, all quite understandable. Happy to wait for a bit longer if you think a good option is on the horizon. :)

@rougier
Member

rougier commented Oct 13, 2018

@tpoisot Do you have more time now to edit this submission?

@rougier
Member

rougier commented Oct 19, 2018

@tpoisot 🛎

@rougier
Member

rougier commented Oct 30, 2018

@tpoisot @karthik Could you edit this submission? (We are very late now.)

@tpoisot

tpoisot commented Oct 30, 2018

On it. I'll invite reviewers as soon as I reach my office.

@tpoisot

tpoisot commented Oct 30, 2018

Reviewer invitations

@gvdr
@jkitzes
@emchristensen

Would one of you be available to review this article? I can walk you through the review process if needed.

@gavinsimpson

Sorry @dmcglinn I'm just catching up with various requests through GitHub - I can't review this just now as I'm overcommitted at the moment (& Python is not really my wheelhouse).

@jkitzes

jkitzes commented Oct 31, 2018

@tpoisot I believe I can get to this - when do you need it by?

@tpoisot

tpoisot commented Oct 31, 2018

@jkitzes 3 weeks?

@jkitzes

jkitzes commented Oct 31, 2018

@tpoisot Sure, I can do that. Anything to know about the review process other than what's in the reviewer guidelines?

@tpoisot

tpoisot commented Nov 7, 2018

Ping @gvdr and @emchristensen

@emchristensen

@tpoisot I'm sorry but I'm not going to be able to take this on at the moment (and I really don't know much about Python).

@jkitzes

jkitzes commented Nov 18, 2018

GENERAL COMMENTS

In this manuscript, the authors create an up-to-date Python implementation of an analysis of several different resampling methods, originally published by Verbyla and Litvaitis. They verify the authors' original conclusion, which is that resubstitution produces biased estimates of classifier error, while several other procedures (including cross-validation, jackknifing, and bootstrapping) produce unbiased estimates. This test is done in the context of a species distribution model, but it applies more generally as well.
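The bias being tested can be sketched in a few lines. This is a hypothetical illustration, not code from the submission, assuming scikit-learn's LinearDiscriminantAnalysis and NumPy:

```python
# Hypothetical illustration (not from the submission): compare the
# resubstitution error of an LDA classifier with a cross-validated estimate.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 60                                  # small sample, where the bias is most visible
X = rng.normal(size=(n, 5))             # five simulated predictor variables
y = (X[:, 0] + 0.5 * rng.normal(size=n) > 0).astype(int)  # noisy presence/absence

lda = LinearDiscriminantAnalysis()
resub_error = 1 - lda.fit(X, y).score(X, y)             # tested on the training data
cv_error = 1 - cross_val_score(lda, X, y, cv=5).mean()  # tested on held-out folds

print(f"resubstitution error: {resub_error:.3f}")
print(f"cross-validation error: {cv_error:.3f}")
```

Resubstitution typically reports a lower (optimistic) error than the held-out estimate, which is the pattern Verbyla and Litvaitis documented.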

I found the manuscript clear and the results reasonable. I would consider the original study to be successfully replicated, while noting that this new implementation actually goes beyond the original analysis and extends its findings in some respects. The code in this PR actually does not run on my system, but with the two modifications below, it runs successfully and reproduces the key figure.

CODE COMMENTS

While I did not go through the code line-by-line to verify its accuracy, the code is written clearly and understandably overall. The implementation appears to follow what the authors describe in the manuscript. I did encounter two issues that needed to be fixed, however, at least on my system:

First, there appears to be one major bug. The function peLDA never actually creates an LDA object that has the methods fit and predict available. Changing this function to

# Define a function to calculate prediction errors from LDA model
def peLDA(trainPA, trainEV, testPA, testEV):
    # Create and fit model
    LDA_obj = LDA()
    LDA_obj.fit(trainEV, trainPA)
    # Predict against test data
    predicts = LDA_obj.predict(testEV)
    # Calculate and return the prediction errors
    return(np.abs(testPA - predicts))

will correct the problem.
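For anyone testing the fix locally, a minimal exercise of the corrected function on toy data might look like this (the import aliases LDA and np are assumptions matching the snippet above; the data are made up):

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

# Corrected function from the review comment above
def peLDA(trainPA, trainEV, testPA, testEV):
    LDA_obj = LDA()
    LDA_obj.fit(trainEV, trainPA)
    predicts = LDA_obj.predict(testEV)
    return np.abs(testPA - predicts)

# Toy data: 3 environmental variables, presence driven by the first one
rng = np.random.default_rng(42)
EV = rng.normal(size=(40, 3))
PA = (EV[:, 0] > 0).astype(int)

# Train on the first 30 sites, test on the last 10
errors = peLDA(PA[:30], EV[:30], PA[30:], EV[30:])
print(errors)  # one value per test site: 0 (correct) or 1 (error)
```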

Second, I did not have LaTeX installed locally on the machine where I tested this code, and the matplotlib figure will not save (throws an error) due to the line plt.rc('text', usetex=True). Changing this to False allows the figure to save - the text is not rendered as nicely, of course, but everything is still in the right place. I would suggest either setting this to False by default or noting that LaTeX is a dependency.

ARTICLE COMMENTS

A few fairly minor comments on the text:

  1. The authors appear to consistently mis-spell the second author's name in the paper they are replicating - it should be "Litvaitis" not "Litvatis".
  2. There's a vertical bar missing from Eq. 3.
  3. I think it would help to link the terminology here back explicitly to that used by Verbyla and Litvaitis. It's fairly obvious for someone used to looking at this, but it couldn't hurt to say things like "Verbyla and Litvaitis use 'Cross-validation' to refer to Hold-out cross-validation performed for one repetition", and so on for the other methods.
  4. While it's logical and useful to test three types of hold-out validation here, I would note that there does not appear to me to be any ambiguity about which case Verbyla and Litvaitis were envisioning - they state under "Cross-validation" that "only one estimate of accuracy is made", which to me indicates that they used H=1. This would also somewhat explain their general statement about the imprecision of the estimate of classification accuracy.
  5. I would suggest reordering the description of approaches in either the "Resampling methods" or "Computational experiment replication" section so that they occur in the same order in both.
  6. I would suggest stating, just for clarity, that the replication here used a LDA for prediction as well.
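Point 4 above is easy to demonstrate: repeating a single hold-out split shows how much an H = 1 accuracy estimate varies. A hypothetical sketch, again assuming scikit-learn:

```python
# Hypothetical sketch of why a single hold-out split (H = 1) gives an
# imprecise accuracy estimate: repeating the split reveals its spread.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.normal(size=(80, 4))
y = (X[:, 0] > 0).astype(int)

estimates = []
for seed in range(50):  # 50 independent 70/30 hold-out splits
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=seed)
    estimates.append(LinearDiscriminantAnalysis().fit(Xtr, ytr).score(Xte, yte))

print(f"single split (H = 1): {estimates[0]:.3f}")
print(f"mean of 50 splits: {np.mean(estimates):.3f} (sd {np.std(estimates):.3f})")
```

Any one split can land anywhere in that spread, which is consistent with Verbyla and Litvaitis's general remark about the imprecision of a single accuracy estimate.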

@tretherington
Author

@tpoisot article correctly typeset now when you have some time to review

@tretherington
Author

Hi @dimpase thanks for providing another test of the code 😃! I've added a comment about needing a dvipng installation in the README.

@laurajanegraham

@tretherington Ah yes, I see that in the README now. That's absolutely fine then; nothing else that I would suggest. Also, not sure why I had such an out of date version of numpy!

@tpoisot I recommend this for acceptance

@tpoisot

tpoisot commented Mar 21, 2019

@rougier following the two positive reviews, I'm happy to recommend acceptance of this article. I'm rusty on the next steps (and @tretherington will need to resolve the issues with the Makefile first anyway).

@rougier
Member

rougier commented Mar 21, 2019

See http://rescience.github.io/edit/ but I copied it below (I can't believe the procedure is that complex, but this will be much simpler soon).

  • Lock the conversation on the original PR

  • Ask the author(s) for keywords if they haven’t provided them already.

  • Import the authors’ repository into the ReScience archives (https://github.com/ReScience-Archives) using the naming convention “Author(s)-YEAR”

  • Add a new remote (named rescience) in your local copy of the repository that points to the newly imported repository (the one on ReScience-Archives)

  • Update article metadata:

    • Editor name
    • Reviewer 1 name
    • Reviewer 2 name
    • Submission date
    • Publication date
    • Article repository
    • Code repository
    • Notebook repository (if necessary)
    • Data repository (if necessary)
    • Volume, issue and year
    • Article number (after reserving the number in the dedicated GitHub issue)
  • If the article name is not Author(s)-YEAR.md, rename it

  • Rebuild the PDF and check everything is OK

  • Merge the rescience branch into master

  • Push these changes onto the rescience remote repository

  • Make a new release:

    Release version number is 1.0
    Release name is Author(s)-YEAR-1.0

  • Download the zip file and rename it to Author(s)-YEAR-1.0.zip

  • Upload this zip file to Zenodo.
    You will have to fill several fields:

    • Name of the journal is ReScience
    • Under “Communities” add “ReScience journal”
    • Under “Contributors” add yourself with role “Editor”.
    • Don’t forget keywords
  • Announce publication in the PR (and quote the DOI, see
    #3 for example)

  • Make a PR to the Web site repository in order to update the page rescience.github.io/read. This requires creating a new post (directory _posts) based on this model because the page is composed from the posts of type article.

  • Make a PR to update Rescience/Volume X - Issue Y.md

  • Close the PR without merging

@tretherington
Author

@tpoisot thanks for finding time for this. All should be good with the Makefile and pdf now, but do let me know if you need anything else from me.

@tretherington
Author

Hi @tpoisot just checking if you are waiting on me for something - I think I've done everything, but please let me know if I've missed something

@tpoisot

tpoisot commented Apr 17, 2019

@tretherington there are still conflicts with the Makefile -- when this is solved, @ReScience/editors will be able to assign an article number.

@tretherington
Author

@tpoisot Aha, yes, the same issue is now corrected in both my Makefile and the master one; sorry, I should have spotted and realised that was an issue. All good now, I think.

@tretherington
Author

Hi @tpoisot, any chance this could be progressed? I appreciate you will be busy, but I have an end of project deadline coming up, and it would be great to have a published version of this for reporting purposes. I will help in whatever way I can!

@tpoisot

tpoisot commented May 14, 2019

I missed the message with the info from @rougier -- I will go through the upload and publication on Friday.

@rougier
Member

rougier commented May 23, 2019

@tretherington @tpoisot I just put the new website online such that the publication process should be easier. I can help on that. What I need at this point is the PDF of the article and a metadata file following this model: https://github.com/ReScience/template/blob/master/metadata.yaml.

@tretherington If you can fill in the metadata and give it back to me, I can probably publish it today. If you want to use the new article template design, it might necessitate a bit more work (from you) but we can also do it.

@tretherington
Author

@rougier I like the look of the new publication process. I think basing things on LaTeX will smooth the process out significantly - I know I spent a lot of time trying to get pandoc working properly, and my paper was first written in LaTeX so I'm back to where I started!

I wasn't sure how best to create/submit the new PDF without (a) breaking the link to the review here, or (b) duplicating the paper by submitting again. So I'm hoping you might be able to figure out the best way to blend my submission under the old system into a paper in the new system. To help you do that I've created a new folder in my pull request that contains the metadata.yaml and content.tex needed to generate the PDF which I have also done (new paper layout/style looks good!).

I think the only things outstanding are some urls and dois for the metadata, but as I've said already, I was hoping you might be able to fill those in as you will understand better how to mesh the two systems.

Hope that is all OK, and please just let me know if you need anything else.

@rougier
Member

rougier commented May 24, 2019

Perfect! I will get the DOI from Zenodo and rebuild your PDF. The last thing I need is a code DOI (in case your repo disappears at some point in the future). Can you deposit it on Zenodo and send me back the DOI?

@tretherington
Author

@rougier many thanks!

I have deposited the repository on Zenodo with DOI: 10.5281/zenodo.3229408

@rougier
Member

rougier commented May 27, 2019

Can you fill in the editor and reviewer names/ORCIDs? Volume is 5, issue is 1 (and can you also fill in the code URL and DOI?).

  • submission date + publication date (28/05/2019)

@tretherington
Author

Hi @rougier I think that is all the metadata except the article number, DOI, and URL.

@rougier
Member

rougier commented May 28, 2019

Can you check if https://sandbox.zenodo.org/record/294119 seems correct?
(This is not the real DOI because this is Zenodo sandbox)

@tretherington
Author

@rougier the article pdf looks correct to me!

@rougier
Member

rougier commented May 29, 2019

Done! https://rescience.github.io/read/#volume-5-2019

@tretherington
Author

Thanks for all your help @rougier ! Having now had some of my work replicated and having replicated someone else's work in ReScience, I've become a big fan of the journal.

@rougier rougier closed this May 30, 2019
@ReScience ReScience locked as resolved and limited conversation to collaborators May 30, 2019