Review Request: Mondeel, Ogundipe, Westerhoff #41
I request a review for the following replication:
Title: Predicting metabolic biomarkers of human inborn errors of metabolism
Author(s): Thierry D.G.A. Mondeel, Vivian Ogundipe, and Hans V. Westerhoff
Potential editors: Timothée Poisot, C. Titus Brown, Karthik Ram
Hi Titus, Metabolomics is not really my area of expertise, but I should still be able to take a look at it. I'm happy to help. Federico…
On Tue, 7 Nov 2017 at 10:49 Aaron Shifman wrote: @ctb I'd be happy to do it
@ThierryMondeel I've split this review into 3 independent sections for the purpose of clarity. Overall I found the code well written and presented. I had some small issues with the manuscript that should be relatively simple to address (only after sending this I noticed that LaTeX doesn't render in the comments, so my apologies).
The code in general is very well written and quite well commented, which is always appreciated. Generally, list comprehensions are less desirable than a vectorized statement; however, given your use of the cobra package, these are unavoidable. I personally am also not a fan of the notebook format, but the fact that the critical component (the algorithm) is in a separate importable module more or less waives that concern. I do have some overall critiques of the code, which should be quite simple to fix.
I'm not a metabolomics person, so feel free to tell me that some of these comments are not warranted. Nevertheless, I do believe that in places things could be made clearer.
First off, the introduction is very direct. FVA is never actually defined other than as a numerical algorithm. Most of the definitions in your methods section (boundary metabolites, exchange reactions, ...) could, I think, go into the introduction. This would make for a much more gradual introduction and bring everyone up to the same place. Setting up the problem a bit more (giving context) could also help.
I do want to highlight a concern with the manuscript, which is how the replication is implemented. My first concern is that you say you replicate Figures 1 and 2 in the manuscript, when in reality Figure 1A is replicated and Figure 2 is presented in a table that doesn't quite match. For example, what the original text calls ARGININEMIA you call Arginase deficiency. I imagine they're synonymous (Google seems to think so), but the closer you can make your replication to the original results, the easier it is to have confidence in the replication. Secondly, I don't understand why this was presented as a table instead of a figure, as done in the original.
With regards to the flux variability calculation section, I found it confusing in places.
I think that if these concerns were addressed, I would be better able to follow the IEM algorithm you present later.
My last concern is with your sensitivity analysis. While I applaud your going above and beyond, so to speak, it seems to take up almost half of your replication. Given that this is a replication paper, I feel that the same result could have been attained with a few sentences (we tried some slightly permuted topologies ... different results ... method is only as good as the pathway annotation ...). Something like this would clarify that you did look into it but that it's not the main focus (the replication is).
Lastly, some of your references give page ranges in the form 1000-1020 and some in the form 1000-20. Could you please make that consistent?
The code does run and does seem to replicate the results found in the manuscript. Due to my not quite understanding the algorithm, I'm having difficulty picking apart the code. However given the care put into putting together the code and that it does seem to replicate the results - my guess is that I won't find anything pathological with further analysis.
There is a part of this about the reference implementation being easy to use. I didn't have tqdm installed on any of my environments. Yes it does only take 15s to install, but it's an extra step - and at least for me it hasn't added much. Whether or not it's included doesn't sway my opinion in either direction - it's just something to consider.
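One common way to keep a progress-bar dependency optional is a guarded import. The following is a minimal sketch, assuming tqdm is only used to wrap iterables; the fallback function is an illustration and is not taken from the reviewed repository:

```python
# Use tqdm when it is installed; otherwise fall back to a no-op wrapper.
try:
    from tqdm import tqdm
except ImportError:
    def tqdm(iterable, **kwargs):
        """Hypothetical fallback: return the iterable unchanged, ignoring tqdm options."""
        return iterable

# The loop body is identical whether or not tqdm is available.
squares = [i * i for i in tqdm(range(5))]
```

With this pattern the code still runs on an environment without tqdm, just without the progress bar.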
Hello @ThierryMondeel, thank you for the changes you made after the last round of revisions. I find that they make the paper much more understandable. I still have some minor comments.
@aaronshifman Thanks! See below.
Ok, I removed this sentence. The sentence following it should explain: "Positive flux indicates net secretion of $X$ by the cell and negative flux indicates the net uptake of $X$."
X is typically the extracellular form of the metabolite. Take the example of lactate: it would be produced intracellularly, be transported outward across the membrane, and then efflux through the exchange reaction, signifying net production.
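The sign convention for exchange fluxes described here can be written down as a small helper. This is only an illustrative sketch; the function name and the tolerance value are assumptions and do not come from the manuscript or its code:

```python
def describe_exchange_flux(flux, tol=1e-9):
    """Interpret the sign of an exchange flux for a metabolite X.

    Positive flux: net secretion of X by the cell.
    Negative flux: net uptake of X by the cell.
    `tol` is an assumed numerical tolerance for treating a flux as zero.
    """
    if flux > tol:
        return "net secretion"
    if flux < -tol:
        return "net uptake"
    return "no net exchange"

# Lactate leaving the cell through its exchange reaction has a positive flux.
lactate_status = describe_exchange_flux(2.5)
```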
Ok, slightly rephrased this now.
Yes ok, I see your point. I reverted to the original paper's notation.
Good catch. Fixed.
Thanks, good suggestion, included now.
I agree, I fixed the ordering. I prefer to keep calling the reactions EX_M* since that is more standard in the field and more clearly emphasizes that they are exchange reactions.
Following this comment, I tried to think about what I could do to investigate the source of the differences. Recall that I used to set the influx of medium components to -1. The reasoning was just that this worked to produce the results in Figure 2. Not so for Figure 3, as you and we pointed out.
So I tried changing the influx to -10. And what do you know! The Figure 2 results are still upheld and Figure 3 now gives the same predictions.
I adjusted the change to -10 in the text and I included the -1 vs. -10 simulations in the notebook of Figure 2 and discuss them in the text now.
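The change from -1 to -10 amounts to loosening the lower bound on the exchange reactions of the medium components. The sketch below shows the idea with a plain dictionary standing in for a metabolic model so that it runs without cobra; the helper name, the reaction ids, and the upper bound of 1000 are all assumptions made for illustration and are not the notebook's actual code:

```python
# Hypothetical stand-in for a model: exchange reaction id -> (lower, upper) bound.
# A negative lower bound permits uptake of that metabolite from the medium.
def set_medium_uptake(bounds, medium_ids, max_uptake):
    """Return a copy of `bounds` where each medium exchange may take up at most `max_uptake`."""
    updated = dict(bounds)
    for rxn_id in medium_ids:
        _, upper = updated[rxn_id]
        updated[rxn_id] = (-abs(max_uptake), upper)
    return updated

base_bounds = {
    "EX_M1": (0.0, 1000.0),
    "EX_M2": (0.0, 1000.0),
    "EX_M3": (0.0, 1000.0),  # not in the medium: secretion only
}
medium = ["EX_M1", "EX_M2"]

bounds_1 = set_medium_uptake(base_bounds, medium, 1)    # influx limited to -1
bounds_10 = set_medium_uptake(base_bounds, medium, 10)  # influx limited to -10
```

Running the same analysis once with `bounds_1` and once with `bounds_10` mirrors the -1 vs. -10 comparison; only the uptake limits of the medium components differ between the two runs.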
Thanks! Without your "push" I wouldn't have figured this out ;)
Yes you are correct. Figure 1 and Figure 3 in our reproduction are made in Illustrator based on the numerical results from the algorithm. I now mention this in the text.
I'm not sure what you have in mind here. The output is already in Figure 3B. The output of the algorithm is the 4 intervals depicted in black and red and I included the numerical values for the interval. No more information comes out of the algorithm. I simply mentioned the table that the notebook generates for readers that want to see exactly where the output is generated and whether I copied it correctly to the figure.
I apologize for the delay in reviewing the last set of changes.
I'm completely satisfied with the current state of the work. In particular, having findBiomarkers.py in the main repo makes it much easier to follow the logic of the implementation. I would recommend accepting the reproduction.