Method to retrieve predictions from SSC #4

jllavin77 · 2022-01-11T14:57:16Z

Dear developers,

I was looking for a semi-supervised ML method in R and found your excellent package. I adapted your example code to my input data, and after some reformatting it appears to work well. My problem is how to access the prediction for each row of my input table.
This may sound naive, but I can't find the code that returns the class assigned to each of the "unlabeled" rows in my table by any of the methods used in your vignette's example code.
I can get the summary of how many samples have been assigned to each class, but I'd like to know how to access each row's individual class/label prediction (in data frame format, for instance).
I hope I have explained this request clearly enough.
Thanks in advance, and congratulations on your nice work.

mabelc · 2022-01-13T21:56:32Z

Thanks for your interest!
Please use the predict method and supply the instances that were unlabeled during training. That way you are using the transductive capabilities of the model, because those instances were also seen during training.
I hope this helps. If you still have questions, don't hesitate to ask.

jllavin77 · 2022-01-14T09:21:23Z

My question is more about having a function that returns that information in table format. Using predict doesn't provide it. You suggest using predict on my unlabeled data, but which model should I use for that prediction?
Could you provide a code example? Is it something similar to this snippet?

`######################REDUCED CODE######################

m <- selfTraining(x = xtrain, y = ytrain, learner = knn3, learner.pars = list(k = 1))

pred <- predict(m, xitest, interval="confidence")

summary(pred) `

Once I carry out this prediction, how do I get the data I'm really looking for? This way I end up with a summary of the predictions, but no clue about which label corresponds to each row. Do you see what I mean?

mabelc · 2022-01-15T12:16:32Z

I think I understand what you are looking for. Could you please try this code? If it does not solve your problem, please keep asking!

library(ssc)   # selfTraining
library(caret) # knn3

##Load Iris data set
data(iris)

x <- iris[, -5] # instances without classes
x <- as.matrix(x)
y <- iris$Species

##Prepare data, use 50% of instances for training
set.seed(1)
tra.idx <- sample(x = length(y), size = ceiling(length(y) * 0.5))
xtrain <- x[tra.idx,] # training instances
ytrain <- y[tra.idx] # classes of training instances

##Use 70% of train instances as unlabeled set
tra.na.idx <- sample(x = length(tra.idx), size = ceiling(length(tra.idx) * 0.7))
ytrain[tra.na.idx] <- NA # remove class information of unlabeled instances

##train selftraining with base classifiers knn3
m <- selfTraining(x = xtrain, y = ytrain, learner = knn3, learner.pars = list(k = 1))

##transductive test
##it's called transductive because we want to predict the instances that were unlabeled during the training
xttest <- xtrain[tra.na.idx,]
pred.label <- predict(m, xttest)

##creating a matrix with the unlabeled training data + labels predicted by selftraining-knn3
##note: cbind on a numeric matrix stores a factor as its integer codes (in the order of levels(y))
xttest <- cbind(xttest, pred.label)
xttest
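
Since the request was for each row's label in data frame format, it may help to see how `cbind` and `data.frame` treat a factor of labels. This is a minimal base-R sketch on toy data: the names `feats`, `labels`, `mat.out` and `df.out` are made up for illustration and are not part of ssc. It assumes the predicted labels come back as a factor.

```r
# Toy stand-ins (hypothetical data, not from ssc): a numeric feature matrix
# with row names, and a factor of predicted labels
feats <- matrix(c(5.1, 3.5,
                  6.2, 2.9,
                  7.7, 3.0),
                ncol = 2, byrow = TRUE,
                dimnames = list(c("r1", "r2", "r3"),
                                c("Sepal.Length", "Sepal.Width")))
labels <- factor(c("setosa", "versicolor", "virginica"))

# cbind() keeps the result a numeric matrix, so the factor is stored
# as its integer codes rather than as class names
mat.out <- cbind(feats, pred.label = labels)

# data.frame() keeps the factor, so every row shows its class name
df.out <- data.frame(feats, pred.label = labels)

# The matrix row names carry over, so a prediction can be looked up by row
df.out["r2", "pred.label"]
```

With the real data, replacing `feats` by `xttest` (before it is overwritten) and `labels` by `pred.label` should give one row per unlabeled instance with its predicted class.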

jllavin77 · 2022-01-18T14:49:26Z

Dear @mabelc,

Thank you very much for your code. It works, and it was exactly what I was asking for.

Just one more question: I have read the selfTraining function documentation and cannot figure out how to change the learner parameter from knn3 to random forest, SVM or any other classifier. Is there a list of the available classifiers explained somewhere?

Thanks in advance for your kind help.

mabelc · 2022-01-22T12:18:13Z

Hi,

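In the package vignette, https://cran.r-project.org/web/packages/ssc/vignettes/ssc.pdf, you can find many examples with different learners. I have modified the previous example to use SVM as the learner. Basically, you can use learners from the R ecosystem; the generic functions provided will help you with that. In the example I am using the generic version of selfTraining, named selfTrainingG. It takes two wrapper functions: one that trains a model on the instances with the given indexes, and one that returns the class probabilities for the instances with the given indexes.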

library('ssc')
library('e1071')

##Load Iris data set
data(iris)

x <- iris[, -5] # instances without classes
x <- as.matrix(x)
y <- iris$Species

##Prepare data, use 50% of instances for training
set.seed(1)
tra.idx <- sample(x = length(y), size = ceiling(length(y) * 0.5))
xtrain <- x[tra.idx,] # training instances
ytrain <- y[tra.idx] # classes of training instances

##Use 70% of train instances as unlabeled set
tra.na.idx <- sample(x = length(tra.idx), size = ceiling(length(tra.idx) * 0.7))
ytrain[tra.na.idx] <- NA # remove class information of unlabeled instances

##wrapper functions to train an SVM
gen.learner <- function(indexes, cls) {
  e1071::svm(x = xtrain[indexes, ], y = cls, type = 'C-classification', probability = TRUE)
}

gen.pred <- function(model, indexes) {
  p <- predict(model, xtrain[indexes, ], probability = TRUE)
  attr(p, "probabilities")
}

##train generic selftraining with SVM as base classifier
m <- selfTrainingG(y = ytrain, gen.learner, gen.pred)

##transductive test
##it's called transductive because we want to predict the instances that were unlabeled during the training
##m$model is the final SVM, so this calls e1071's predict method directly
xttest <- xtrain[tra.na.idx,]
pred.label <- predict(m$model, xttest)

##creating a matrix with the unlabeled training data + labels predicted by selftraining-SVM
xttest <- cbind(xttest, pred.label)
xttest
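
For the random forest case asked about above, the same wrapper pattern should carry over. The following is only a sketch: it assumes the randomForest package is installed (it is not used anywhere else in this thread), and the names `rf.learner`, `rf.pred`, `m.rf` and `result` are made up for illustration. It also assumes the labeled subset contains at least one instance of every class, since `randomForest` rejects empty classes.

```r
library(ssc)          # selfTrainingG
library(randomForest) # assumption: random forest implementation chosen for this sketch

## Same data preparation as in the examples above
data(iris)
x <- as.matrix(iris[, -5])
y <- iris$Species

set.seed(1)
tra.idx <- sample(x = length(y), size = ceiling(length(y) * 0.5))
xtrain <- x[tra.idx, ]
ytrain <- y[tra.idx]
tra.na.idx <- sample(x = length(tra.idx), size = ceiling(length(tra.idx) * 0.7))
ytrain[tra.na.idx] <- NA

## Wrapper that trains a random forest on the instances with the given indexes
rf.learner <- function(indexes, cls) {
  randomForest::randomForest(x = xtrain[indexes, ], y = cls)
}

## Wrapper that returns the class probability matrix for the given indexes
rf.pred <- function(model, indexes) {
  predict(model, xtrain[indexes, ], type = "prob")
}

## Generic self-training with random forest as base classifier
m.rf <- selfTrainingG(y = ytrain, rf.learner, rf.pred)

## Transductive prediction, returned as a data frame with one label per row
xttest <- xtrain[tra.na.idx, ]
pred.label <- predict(m.rf$model, xttest)
result <- data.frame(xttest, pred.label)
head(result)
```

Only the two wrappers change between learners; the call to `selfTrainingG` and the prediction on `m.rf$model` follow the SVM example.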

jllavin77 changed the title ~~How to obtain predictions~~ How to retrieve predictions Jan 12, 2022

jllavin77 changed the title ~~How to retrieve predictions~~ How to retrieve predictions from SSC Jan 12, 2022

jllavin77 changed the title ~~How to retrieve predictions from SSC~~ Method to retrieve predictions from SSC Jan 12, 2022
