add Gaussian Covariance filter #148

Open
wants to merge 3 commits into base: main


MislavSag
Contributor

Adding the Gaussian Covariance 'filter' from the gausscov package: https://cran.r-project.org/web/packages/gausscov/gausscov.pdf.

I am getting an error when running the tests and examples, but I can't figure out why.

Error in .__Param__assert(self = self, private = private, super = super, :
Assertion on 'x' failed: Element 1 is not >= 1.

It seems like my filter is not in the mlr_filters list.

The filter works as expected when I instantiate the class directly and run calculate and score.

I am not sure whether classif tasks are supported, so I added regr only. I can try to contact the package author.

Missing values are not allowed.

@MislavSag
Contributor Author

MislavSag commented Jan 31, 2023

I have just checked the examples in the gausscov package, and one of them uses a binary covariate, so it works for classification too. However, the target variable has to be a matrix, not a factor. I can add a classif example after you review the initial PR.
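
A minimal sketch of the factor-to-matrix conversion mentioned above, using only base R. The toy values are made up for illustration, and whether gausscov expects the integer codes 1/2 produced here or a 0/1 coding is not confirmed in this thread.

```r
# Toy binary target (hypothetical data, not taken from the PR)
y_factor = factor(c("no", "yes", "yes", "no"))

# as.integer() on a factor returns the level codes (1 = "no", 2 = "yes");
# as.matrix() turns that vector into a one-column matrix
y_matrix = as.matrix(as.integer(y_factor))

is.matrix(y_matrix)
dim(y_matrix)
```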

kex should have lower = 0:
```
kex  = p_int(lower = 0, default = 0),
```
Same as before.
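
One plausible reading of the error at the top of the thread, assuming the first commit declared kex with lower = 1 (that declaration is not shown here): a default of 0 would then violate the lower bound. The sketch below runs the same kind of bound check with checkmate directly rather than through paradox. check_int() returns TRUE on success and a message string on failure.

```r
library(checkmate)

# Default value proposed for kex in this thread
default_kex = 0L

# Assumed original bound (lower = 1): the default is out of range,
# so check_int() returns an error message instead of TRUE
bad = check_int(default_kex, lower = 1)

# Suggested bound (lower = 0): the default is feasible
ok = check_int(default_kex, lower = 0)

print(bad)
print(ok)
```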
@sebffischer
Sponsor Member

Sorry for not responding here (I did not see it).

Are you still interested in contributing this filter?

@MislavSag
Contributor Author

Yes. I will send the latest version of the pipe. I think I have changed a few things since opening the PR.

Is there anything I should add to the current commit?

@sebffischer
Sponsor Member

When I run the test from the pull request, I get a lot of NA values. Can you explain why this happens?

@MislavSag
Contributor Author

I can't make a new PR for some reason, but can you try this code:

```
library(paradox) # provides ps(), p_dbl(), p_int(), p_lgl()

FilterGausscovF1st = R6::R6Class(
  "FilterGausscovF1st",
  inherit = mlr3filters::Filter,

  public = list(

    #' @description Create a GaussCov object.
    initialize = function() {
      param_set = ps(
        p0   = p_dbl(lower = 0, upper = 1, default = 0.01),
        kmn  = p_int(lower = 0, default = 0),
        kmx  = p_int(lower = 0, default = 0),
        mx   = p_int(lower = 1, default = 21),
        kex  = p_int(lower = 0, default = 0),
        sub  = p_lgl(default = TRUE),
        inr  = p_lgl(default = TRUE),
        xinr = p_lgl(default = FALSE),
        qq   = p_int(lower = 0, default = 0)
      )

      super$initialize(
        id = "gausscov_f1st",
        task_types = c("classif", "regr"),
        param_set = param_set,
        feature_types = c("integer", "numeric"),
        packages = "gausscov",
        label = "Gauss Covariance f1st",
        man = "mlr3filters::mlr_filters_gausscov_f1st"
      )
    }
  ),

  private = list(
    .calculate = function(task, nfeat) {
      # debug
      # pv = list(
      #   p0   = 0.01,
      #   kmn  = 0,
      #   kmx  = 0,
      #   mx   = 21,
      #   kex  = 0,
      #   sub  = TRUE,
      #   inr  = TRUE,
      #   xinr = FALSE,
      #   qq   = 0
      # )

      # named score vector, one entry per feature, initialised to -1
      # so that features not returned by f1st get a score instead of NA
      scores = rep(-1, length(task$feature_names))
      scores = mlr3misc::set_names(scores, task$feature_names)

      # calculate gausscov pvalues
      pv = self$param_set$values
      x = as.matrix(task$data(cols = task$feature_names))
      # gausscov needs a matrix target, so a factor is converted to integer codes
      if (task$task_type == "classif") {
        y = as.matrix(as.integer(task$truth()))
      } else {
        y = as.matrix(task$truth())
      }
      res = mlr3misc::invoke(gausscov::f1st, y = y, x = x, .args = pv)
      res_1 = res[[1]]
      # rows with index 0 in column 1 do not map to a feature column, so drop them
      res_1 = res_1[res_1[, 1] != 0, , drop = FALSE]
      scores[res_1[, 1]] = abs(res_1[, 4])

      # save scores to disk (side effect, kept from the original for inspection)
      dir_name = "./gausscov_f1"
      if (!dir.exists(dir_name)) {
        dir.create(dir_name)
      }
      random_id = paste0(sample(0:9, 15, replace = TRUE), collapse = "")
      file_name = paste0("gausscov_f1-", task$id, "-", random_id, ".rds")
      file_name = file.path(dir_name, file_name)
      saveRDS(scores, file_name)

      sort(scores, decreasing = TRUE)
    }
  )
)
```
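
On the earlier concern that the filter is missing from mlr_filters: a class defined outside the package is only reachable through flt() once it has been added to the dictionary. The following is a minimal sketch with a hypothetical toy filter. FilterToyVariance and the key toy_variance are made up for illustration and are not part of mlr3filters. FilterGausscovF1st could presumably be registered the same way, assuming gausscov is installed.

```r
library(mlr3)
library(mlr3filters)
library(paradox)

# Hypothetical toy filter, used only to show the registration step:
# it scores each feature by its variance.
FilterToyVariance = R6::R6Class(
  "FilterToyVariance",
  inherit = mlr3filters::Filter,
  public = list(
    initialize = function() {
      super$initialize(
        id = "toy_variance",
        task_types = "regr",
        param_set = ps(),
        feature_types = c("integer", "numeric"),
        label = "Toy variance filter"
      )
    }
  ),
  private = list(
    .calculate = function(task, nfeat) {
      x = task$data(cols = task$feature_names)
      # one named numeric score per feature column
      scores = vapply(x, stats::var, numeric(1))
      sort(scores, decreasing = TRUE)
    }
  )
)

# Add the class to the dictionary so that flt() can find it by key
mlr_filters$add("toy_variance", FilterToyVariance)

f = flt("toy_variance")
f$calculate(tsk("mtcars"))
head(as.data.table(f))
```

Without the mlr_filters$add() call, the class can still be used through FilterToyVariance$new(), which matches the observation that instantiating the class directly works.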

@sebffischer
Sponsor Member

You can't make a new PR from your main branch because you already have a PR open. You could e.g. make a new branch in your fork and then create a new PR.
