Summary of GSOC2019 calls #1
May 28, 2019 Update
Hi @avinashbarnwal, can you please post the source code and rendered images for the loss charts you created?
Looks like a good start, but the plot is incorrect for uncensored data: it should look like the square loss around the label for the normal distribution. Also, you should try creating a facetted ggplot, with one panel per censoring type. For inspiration, here is the code I used for the 1-page AFT poster: https://github.com/tdhock/aft-poster/blob/master/figure-loss.R
Thanks, Prof. @tdhock, I am looking into it.
Looks better, @avinashbarnwal! Glad to see that you got the facetted ggplots working. However, it looks like there is a problem with your computation of the logistic loss for the uncensored output -- it should be minimal at the label.

Also, for next week's homework, please check your formulas for the gradient in the Overleaf. Add a row of plots to the figure that shows the loss functions. Use
I have made the loss function with all the changes required, and I have changed the code as well. Prof. @tdhock, please check the formula for the Normal - Uncensored part. I have used a different formula from the one given in https://github.com/avinashbarnwal/GSOC-2019/blob/master/paper/HOCKING-AFT.pdf; instead I have used the formula given in this document: http://home.iitk.ac.in/~kundu/paper146.pdf.
Looks better, Avinash. But why doesn't the uncensored loss go to zero? (It should...) About the normal censored loss, you should double check your work by using the normal CDF (pnorm in R).
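The pnorm check can be reproduced outside R as well. Below is a minimal Python sketch (my own illustration, not the project's code) of the censored normal negative log-likelihood on the log scale, using the error function to build the standard normal CDF:

```python
import math

def norm_cdf(x):
    # standard normal CDF: the Python counterpart of R's pnorm(x)
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def censored_normal_loss(eta, a, b, sigma=1.0):
    # negative log-likelihood of a censored observation whose log survival
    # time lies in (a, b); a = -inf gives left censoring, b = +inf gives
    # right censoring, and finite a < b gives interval censoring
    upper = norm_cdf((b - eta) / sigma) if math.isfinite(b) else 1.0
    lower = norm_cdf((a - eta) / sigma) if math.isfinite(a) else 0.0
    return -math.log(upper - lower)
```

The loss is small when the prediction eta sits inside the interval and grows as it moves away, which is the shape the plots in this thread should show.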
(Quoted from Avinash's email reply: plot - https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/R/loss_aft.png; code - https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/R/assigment1.R.)
Prof. @tdhock, for the uncensored loss we have a constant term -log(1/(t.lower * sigma * sqrt(2*pi))) irrespective of y.hat. This makes the loss non-zero, based on my thinking. Similarly for the logistic. Sorry, I meant the Normal Uncensored formula above. (Old and new Normal Uncensored formulas were attached as images.)
Right, but we usually subtract away the constant terms. If you do that, you should recover the square loss, which is 0 at the min.
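This can be checked numerically. The sketch below (my own illustration, using the thread's t.lower / y.hat / sigma notation) evaluates the uncensored normal loss and confirms that, after subtracting its value at the minimum y.hat = t.lower, what remains is the squared error in log space:

```python
import math

def normal_uncensored_loss(t_lower, y_hat, sigma=1.0):
    # -log( 1/(t.lower*sigma*sqrt(2*pi)) * exp(-(log(t.lower/y.hat))^2 / (2*sigma^2)) )
    z = math.log(t_lower / y_hat) / sigma
    return -math.log(1.0 / (t_lower * sigma * math.sqrt(2.0 * math.pi))
                     * math.exp(-z * z / 2.0))

t_lower, sigma = 5.0, 1.0
constant = normal_uncensored_loss(t_lower, t_lower, sigma)  # loss at its minimum
# loss - constant == (log(t.lower) - log(y.hat))^2 / (2*sigma^2),
# which is 0 at y.hat = t.lower, exactly the square loss shape
```

So the raw loss is indeed non-zero at the minimum (the constant term), but the shifted loss is zero there.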
(Quoted from Avinash's email reply; the uncensored Normal loss discussed was -log(1/(t.lower*sigma*sqrt(2*pi)) * exp((log(t.lower/y.hat))^2/(-2*sigma*sigma))).)
Please find the updated document for AFT (https://github.com/avinashbarnwal/GSOC-2019/blob/master/doc/Accelerated_Failure_Time.pdf). I think I have made a mistake here, as I am taking the gradient and Hessian with respect to y.hat, not X\beta. For the least-squares loss function, the gradient and Hessian with respect to y.hat and X\beta are the same, but it starts mattering when link functions are involved. Similarly, for classification, we take the gradient and Hessian with respect to X\beta, not y.hat. For reference, please check this doc: https://cran.r-project.org/web/packages/gbm/vignettes/gbm.pdf. This calls for a decision on notation, since the survival document uses f for the pdf, while the document above uses f(x_i) for X\beta. Please let me know your thoughts.
I don't think there is an issue, because the link function is always the identity for the normal and logistic models. Again, we should be able to use the formulas in the survival manual.
Prof. @tdhock, I think we need to calculate the gradient with respect to log(y.hat), not y.hat, as log(y.hat) = X\beta. Please let me know.
Section 6.8 of the survival manual gives derivatives with respect to eta, which is the real-valued prediction.
In your notation, you use log(y.hat) for the real-valued prediction, so that is the same as eta from the survival manual. In your PDF, please do not use X\beta, as that is only valid for linear models.
Prof. @tdhock, thanks. I will use eta for log(y.hat) and take the gradient and Hessian based on that. In my document I was taking the gradient and Hessian with respect to y.hat, not eta. I will make the correction.
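Once everything is differentiated with respect to eta = log(y.hat), the uncensored normal case works out simply. A hedged sketch (my own illustration, with constant terms dropped), including a finite-difference check of the analytic derivatives:

```python
import math

def loss(eta, y, sigma=1.0):
    # uncensored normal AFT loss in eta = log(y.hat), constants dropped
    z = (y - eta) / sigma
    return 0.5 * z * z

def grad(eta, y, sigma=1.0):
    # d loss / d eta
    return -(y - eta) / sigma ** 2

def hess(eta, y, sigma=1.0):
    # d^2 loss / d eta^2 -- constant, as expected for a square loss
    return 1.0 / sigma ** 2

# finite-difference sanity check of the analytic gradient
eta, y, h = 0.3, 1.7, 1e-6
num_grad = (loss(eta + h, y) - loss(eta - h, y)) / (2.0 * h)
```

The same finite-difference check is a cheap way to validate the censored cases too, where the formulas are less obvious.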
Prof. @tdhock, @hcho3, I have created the document and plots for the loss, negative gradient, and Hessian. Document - https://github.com/avinashbarnwal/GSOC-2019/blob/master/doc/Accelerated_Failure_Time.pdf Please let me know your thoughts.
The plot looks good, except for the Hessian of the interval-censored outputs for small predictions -- it seems to be too large. Can you please double check? It should look like the right-censored outputs (i.e. with a finite lower limit) on the left side of the plot.
Thanks, Prof. @tdhock. I have changed both the code and the plot; I had one sign wrong in the formula. Please recheck the plot (https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/R/loss_grad_hess_aft.png). Do you think the interval Hessian is correct for the Normal distribution?
Please find the Python implementation of gradient boosting for AFT. Code - https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/py/gb_aft.ipynb Let me know your thoughts.
We'll need to check the POC implementation, since the training loss should trend down, not up.
Please find the document for the binomial loss and the code below: Document - https://github.com/avinashbarnwal/GSOC-2019/blob/master/doc/Binomial_Loss.pdf
There is still something wrong with the interval Hessian for small predicted values on https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/R/loss_grad_hess_aft.png -- it should be constant as the prediction goes to zero (log(prediction) goes to -Inf).
The binomial loss looks reasonable, except for the x-axis label. Please (1) use facet_grid, (2) use more grid points, and (3) maybe use different columns or colors for different labels.
Please check the R plots and Python plots again. (R and Python plot images attached.)
How will that be specified in the user interface? Right now the "survival:cox" objective is documented on https://xgboost.readthedocs.io/en/latest/parameter.html as "negative values are considered right censored", which implies to me that the "survival:" prefix means all outputs are positive-valued (un-logged survival times). To avoid confusion with that objective/output, I would suggest a new prefix for the three loss functions: "aft:gaussian", "aft:extreme", and "aft:logistic" -- does that make sense to you, @hcho3? In each of these cases the output should be specified by an interval (a, b) where both a and b are potentially negative (log survival times). In other words, if you have an output vector
@tdhock I think "survival:cox" uses survival time, not log survival time. So the output would be positive in that case. The current AFT implementation uses survival time as well.
OK, in that case the -0.1 output for survival:cox becomes (0.1, INFINITY) for us. However, I would suggest specifying outputs as real-valued (potentially negative) log survival times, which has two advantages in my opinion.
Maybe both aft:gaussian (using real-valued log survival times) and survival:loggaussian (using positive-valued un-logged survival times) etc. can be supported?
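The label conversion described above (a survival:cox output of -0.1 becoming the interval (0.1, INFINITY)) can be sketched in a few lines. Note that cox_label_to_interval is a hypothetical helper name for illustration, not an actual XGBoost API:

```python
import math

def cox_label_to_interval(y):
    # survival:cox convention: negative values mean right censoring
    if y < 0:
        return (-y, math.inf)  # right-censored: true time exceeds -y
    return (y, y)              # uncensored: a degenerate interval
```

Representing uncensored labels as degenerate intervals lets one code path handle all censoring types.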
@tdhock That's a reasonable suggestion. We will need to revise
I am going through the document https://github.com/avinashbarnwal/GSOC-2019/blob/master/paper/AFT/survival.pdf that you shared. On page 82 there is the Extreme distribution section, which says "If y is Weibull then log(y) is distributed according to the (least) extreme". In the previous cases we had normal and logistic, where y itself follows the normal and the logistic. Please let me know if I am missing anything. I am trying to test the extreme distribution using a notebook on the first few datasets, and I am finding inf in the errors. (Screenshot of the gradient and Hessian attached.) I have used all the formulas mentioned in the document.
The loss function for the extreme value distribution goes to Inf very fast as the predicted value gets smaller. If it helps, there is an alternative way to compute the loss (my.cost in the code below) which is more numerically stable (results in Inf less often). (Code snippet attached.)
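The my.cost snippet itself isn't preserved in this thread, but the kind of stabilization being described can be illustrated in Python (this is my own sketch of a standard technique, not the original R code): expanding -log of the extreme-value density algebraically avoids taking the log of an underflowed density.

```python
import math

def extreme_uncensored_loss_naive(y, eta, sigma=1.0):
    # evaluate the density exp(z - exp(z))/sigma first, then take -log;
    # the density underflows to 0 for large z, producing Inf
    z = (y - eta) / sigma
    density = math.exp(z - math.exp(z)) / sigma
    return -math.log(density) if density > 0 else math.inf

def extreme_uncensored_loss_stable(y, eta, sigma=1.0):
    # same quantity with the log expanded algebraically:
    # -log( exp(z - exp(z)) / sigma ) = -z + exp(z) + log(sigma)
    z = (y - eta) / sigma
    return -z + math.exp(z) + math.log(sigma)
```

For small predictions (large z) the naive version returns Inf while the stable version stays finite; for moderate z the two agree.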
Hi Prof. @tdhock, please let me know what kind of censoring we have in the above formula; it looks like right censoring. Do we have different stabilizing techniques for different kinds of censoring? Also, what are we finally using for y in the 3rd distribution: extreme or Weibull?
I have added a comment to f52c967#r34598175.
In my opinion, the y values (at least internally) should be on the real-valued (potentially negative) scale for all distributions, so the 3rd distribution should be "extreme". That being said, if you run into issues with extreme, I think it would be more useful to just have a working prototype for the other two distributions.
I have kept "extreme" as the 3rd distribution. I have changed the conditions to detect the censoring type as follows:
Please let me know if you think this is correct. The rest of the code is the same, and I haven't made any changes to the "Extreme" distribution.
I'm not sure that is correct. y_lower and y_higher should be log survival times (possibly negative): if y_lower == y_higher, then un-censored (use the pdf).
In particular, I would suggest using INFINITY, which is defined in math.h. I don't think you should use the 1e-12 constant, which is totally arbitrary.
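The rule being suggested can be written down directly. A minimal sketch, using math.inf as Python's counterpart of INFINITY from math.h:

```python
import math

def censoring_type(y_lower, y_upper):
    # y_lower / y_upper are log survival times (possibly negative);
    # infinite limits encode the open side of a censored interval
    if y_lower == y_upper:
        return "uncensored"
    if y_lower == -math.inf:
        return "left"
    if y_upper == math.inf:
        return "right"
    return "interval"
```

No arbitrary epsilon is needed: equality and infinity comparisons classify every case.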
Hi Prof. @tdhock, we have been using y as un-logged values; that is why the comparison with 0 is done. I will change y_lower and y_higher to the logged format internally. I saw that in your datasets you have log-transformed responses, and to test, I took the anti-log of the responses. Do we expect users to enter survival times or log-transformed responses? I am not sure about this.
In survival::survreg the user can specify either kind of output. For me the logged outputs make more sense, because that is what we use internally anyway.
Thanks. I will change y_lower and y_higher to log-transformed values, along with the conditions to detect the censoring types.
I have pushed the following changes:
I have tested here - https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/test/data/neuroblastoma-data-master/src/notebook/002_xgboost_test.ipynb. We have similar results as before.
Good. Is there a PR where we can view a summary of all the changes you want to merge (files changed)?
I am trying to write the R vignette, and I am getting this error: Error in setinfo.xgb.DMatrix(dtrain, "label_lower_bound", y_lower) : Please find the R code - https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/R/doc.R Any help would be great.
Sounds like an xgboost error; @hcho3 should know.
Is it normal in xgboost to specify the outputs/labels as attributes of the input/feature matrix? At least in R that is pretty confusing. The usual ways in R are
@tdhock The mistake was due to a badly chosen
@tdhock I do not think the XGBoost package supports the use of a formula. Currently, users are asked to create an object of type
Hi Prof. @tdhock, @hcho3 helped in resolving the issue. We have a working vignette here - https://github.com/avinashbarnwal/GSOC-2019/blob/master/AFT/R/doc.R. I am structuring it further.
Red flag! Why doesn't sigma=1 work? I think it should be fine for the data sets I have provided. sigma=10 seems too big to me -- there is almost no flat region for interval-censored outputs. s <- 1; curve(-log(pnorm(10,x,s)-pnorm(5,x,s)), 0, 20)
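Translating that curve() one-liner into Python makes the sigma effect concrete (a sketch of mine; the interval (5, 10) comes from the R call above):

```python
import math

def interval_loss(eta, sigma):
    # -log(pnorm(10, eta, sigma) - pnorm(5, eta, sigma)),
    # the interval-censored normal loss from the R curve() call
    phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return -math.log(phi((10.0 - eta) / sigma) - phi((5.0 - eta) / sigma))

# with sigma = 1 the loss is nearly flat (close to 0) inside the interval,
# while sigma = 10 leaves almost no flat region
```

At the interval midpoint eta = 7.5 the loss is tiny for sigma=1 but stays well above zero for sigma=10, which is exactly the missing-flat-region complaint.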
Hi Prof. @tdhock, I think we need to look at the standard deviation of log(survival times) and use a similar sigma as the starting point. We tested with a very small sigma, and it gave very small predicted values against very large log(survival times). For example, log(survival times) ~ 10 while y.hat was 0.03. I think this might not be correct. Please let me know your thoughts.
@tdhock @avinashbarnwal Maybe we need to set a bigger value for
More generally, you could set it to the mean of all the finite limits?
I have included base_score as the average and sigma as the standard deviation of the log(survival_times).
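The initialization just described (base_score as the average, sigma as the standard deviation of the finite log survival time limits) could look like the sketch below; init_params is an illustrative helper of mine, not the actual project code:

```python
import math

def init_params(y_lower, y_upper):
    # pool all finite interval limits (log survival times)
    finite = [v for pair in zip(y_lower, y_upper) for v in pair
              if math.isfinite(v)]
    base_score = sum(finite) / len(finite)          # mean of finite limits
    var = sum((v - base_score) ** 2 for v in finite) / len(finite)
    sigma = math.sqrt(var)                          # population std. dev.
    return base_score, sigma

b, s = init_params([1.0, 2.0, 0.0], [1.0, math.inf, 4.0])
```

Infinite limits from right- or left-censored rows are simply skipped, matching the "mean of all the finite limits" suggestion above.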
21MAY2019
Relevant Information -
Ans :- Create two columns for all the times: a 2-column matrix representation of outputs = labels. There is no need for the survival object; it is more of a legacy.
How to handle sigma in AFT?
Ans :- Treat it as a hyperparameter.
What is the dimension of the predicted value in interval regression?
Ans :- It is always one real value, not an interval.
Relevant Document
https://github.com/tdhock/aft-poster/blob/master/HOCKING-AFT.pdf
http://members.cbio.mines-paristech.fr/~thocking/survival.pdf
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-9-269
Please check.
@tdhock, @hcho3