checkGeneSymbols ready for review #34

Closed
bagherig opened this issue Mar 23, 2017 · 1 comment

Comments

@bagherig
Contributor

test_that("isGeneSymbol identifies invalid gene symbols", {
  # Input contains invalid (non-HGNC) gene symbols.
  input <- c("", "!@#", "AR3bsdt!", "123")
  expected_output <- c(FALSE, FALSE, FALSE, FALSE)
  expect_equal(isGeneSymbol(input), expected_output)
})

test_that("isGeneSymbol identifies correct gene symbols, and input is case insensitive", {
  # Input contains correct HGNC gene symbols.
  input <- c("A1BG", "a1bg", "A1bG", "a1Bg")
  expected_output <- c(TRUE, TRUE, TRUE, TRUE)
  expect_equal(isGeneSymbol(input), expected_output)
})

test_that("a corrupt input does not lead to corrupted output", {
  # Input is corrupted (NULL, NA, vector of length zero, or no input at all).
  expect_error(isGeneSymbol())
  expect_error(isGeneSymbol(c()))
  expect_error(isGeneSymbol(NULL))
  expect_error(isGeneSymbol(NA))
})

@hyginn
Owner

hyginn commented Mar 23, 2017

It's easier for me to comment into the code if you create a pull request.

This is a simple function, so the tests are simple: reject everything that is not of mode(), typeof(), and class() "character", or that has length() < 1. Then test that a few keys that are present in the table return TRUE, and that some that are not present return FALSE.
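
To make that concrete, here is a minimal sketch of a function that would satisfy those tests. It is an illustration only, not the package's code: the three-entry hgncSymbols vector is a hypothetical stand-in for the real HGNC table, and is.character() is used as a compact approximation of the mode()/typeof()/class() checks. The case-insensitive lookup follows the tests proposed above.

```r
# Hypothetical stand-in for the HGNC symbol table (assumption: the real
# table comes from the parsing script and holds upper-case symbols).
hgncSymbols <- c("A1BG", "A1CF", "A2M")

# Assumed implementation, for illustration only.
isGeneSymbol <- function(x) {
  # Reject anything that is not a character vector of length >= 1.
  stopifnot(is.character(x), length(x) >= 1)
  # Case-insensitive lookup; %in% yields FALSE (never NA) for NA elements.
  toupper(x) %in% hgncSymbols
}

isGeneSymbol(c("A1BG", "a1bg", "A2M"))   # keys present in the table
isGeneSymbol(c("", "!@#", "123", NA))    # valid input, but not in the table
```

With this shape, the testthat cases reduce to expect_error() for the rejected inputs and expect_equal() against TRUE/FALSE vectors for the lookups.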

Should NA in the input return NA or FALSE? By analogy to is.finite(c(1:3, NA, 4L)) I vote it should return FALSE, not NA.
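
For reference, this is the base R behaviour the analogy points to, together with how %in% treats NA. The symbols vector is a made-up example, not the real table.

```r
# is.finite() reports FALSE for the NA element rather than propagating NA.
finiteFlags <- is.finite(c(1:3, NA, 4L))
finiteFlags

# %in% behaves the same way: an NA element that is not in the table gives FALSE.
symbols <- c("A1BG", "A1CF")   # hypothetical table
inTable <- c("A1BG", NA) %in% symbols
inTable
```

A lookup built on %in% would therefore return FALSE for an NA element inside a character vector without any special handling.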

A character string that is not in the table is not "invalid", but input values like TRUE, or 3.142, or a data frame would be invalid. That done, I can't imagine "corrupt" input, so unless you can think of some, we can skip that. "Corrupt" would mean that it could violate some of the internal assumptions of the function. But there really are none.
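
The distinction between "not in the table" and "invalid" can be sketched with a small predicate. isValidSymbolInput() is a hypothetical helper written for this comment, not something that exists in the package.

```r
# Hypothetical helper: TRUE if x is acceptable input for a symbol lookup,
# i.e. a character vector with at least one element.
isValidSymbolInput <- function(x) {
  is.character(x) && length(x) >= 1
}

isValidSymbolInput("!@#")                    # valid input, just not a gene symbol
isValidSymbolInput(TRUE)                     # invalid: logical
isValidSymbolInput(3.142)                    # invalid: numeric
isValidSymbolInput(data.frame(x = "A1BG"))   # invalid: data frame
```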

Testing the script that parses the HGNC table is another matter, but we can't really do that because we are not running the script through testthat. So we'll just have to read it carefully.

Reviewed - write the tests.

@hyginn hyginn closed this as completed Mar 23, 2017