Bad wrapping of long character arguments #21

Closed
eusebe opened this Issue Jul 28, 2011 · 8 comments

Comments

Projects
None yet
3 participants
@eusebe

eusebe commented Jul 28, 2011

When there is a default argument of class character, there is sometimes a extra "\n" character in the \usage section which leads to a "Mismatches argument default value" warning in R CMD check.

R code:

##' foo
##'
##' @param a a
##' @param b b
##' @param c c
foo <- function(a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np. \n\n") {
  return(NULL)
}

\usage section:

\usage{
  foo(a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np.
  \n\n")
}

Please not the newline for the 'c' argument.

R cmd check:

* checking for code/documentation mismatches ... WARNING
Codoc mismatches from documentation object 'foo':
foo
  Code: function(a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np.
                 \n\n")
  Docs: function(a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np.\n
                 \n\n")
  Mismatches in argument default values:
    Name: 'c' Code: "\n\np. \n\n" Docs: "\n\np.\n  \n\n"

@hadley

This comment has been minimized.

Show comment Hide comment
@hadley

hadley Jul 28, 2011

Collaborator

This is because we're using str_wrap to wrap long usage lines, and it doesn't know not to break on whitespace in strings. We need a better algorithm for line-wrapping code.

An initial thought would be to split the usage string on commas, compute the length of each string + 3 (for the comma and indent), compute cumulative sum and then break using modulo arithmetic. (Probably a bit too simple)

Collaborator

hadley commented Jul 28, 2011

This is because we're using str_wrap to wrap long usage lines, and it doesn't know not to break on whitespace in strings. We need a better algorithm for line-wrapping code.

An initial thought would be to split the usage string on commas, compute the length of each string + 3 (for the comma and indent), compute cumulative sum and then break using modulo arithmetic. (Probably a bit too simple)

@eusebe

This comment has been minimized.

Show comment Hide comment
@eusebe

eusebe Jul 28, 2011

And if the character string contains a comma?

str_wrap does a pretty good job in general. Perhaps the best solution here is to override the default usage string 'by hand' with @Usage tag (and leave it 'untouch' by str_wrap in this case)?

eusebe commented Jul 28, 2011

And if the character string contains a comma?

str_wrap does a pretty good job in general. Perhaps the best solution here is to override the default usage string 'by hand' with @Usage tag (and leave it 'untouch' by str_wrap in this case)?

@eusebe

This comment has been minimized.

Show comment Hide comment
@eusebe

eusebe Jul 28, 2011

Another solution could be to format correctly the string inside the usage() function. A very ugly solution inspired by your comment:

usage <- function (args) {
    is.missing.arg <- function(arg) {
        is.symbol(arg) && deparse(arg) == ""
    }
    arg_to_text <- function(arg) {
        if (is.missing.arg(arg))
            return("")
        text <- deparse(arg, backtick = TRUE, width.cutoff = 500L)
        text <- str_replace_all(text, fixed("%"), "\\%")
        paste(" = ", paste(text, collapse = "\n"), sep = "")
    }
    vectortolist <- function(x, l) {
      res <- list()
      xx <- x
      for (i in l) {
        res <- c(res, list(xx[1:i]))
        xx <- xx[-(1:i)]
      }
      res
    }
    arg_values <- vapply(args, arg_to_text, character(1))
    ## paste(names(args), arg_values, collapse = ", ", sep = "")
    results <- paste(names(args), arg_values, sep = "")
    cuml <- cumsum(nchar(results))
    breaks <- table(cuml %/% 60)
    results <- vectortolist(results, breaks)
    paste(lapply(results, paste, collapse = ", "), collapse = ", \n")
}

eusebe commented Jul 28, 2011

Another solution could be to format correctly the string inside the usage() function. A very ugly solution inspired by your comment:

usage <- function (args) {
    is.missing.arg <- function(arg) {
        is.symbol(arg) && deparse(arg) == ""
    }
    arg_to_text <- function(arg) {
        if (is.missing.arg(arg))
            return("")
        text <- deparse(arg, backtick = TRUE, width.cutoff = 500L)
        text <- str_replace_all(text, fixed("%"), "\\%")
        paste(" = ", paste(text, collapse = "\n"), sep = "")
    }
    vectortolist <- function(x, l) {
      res <- list()
      xx <- x
      for (i in l) {
        res <- c(res, list(xx[1:i]))
        xx <- xx[-(1:i)]
      }
      res
    }
    arg_values <- vapply(args, arg_to_text, character(1))
    ## paste(names(args), arg_values, collapse = ", ", sep = "")
    results <- paste(names(args), arg_values, sep = "")
    cuml <- cumsum(nchar(results))
    breaks <- table(cuml %/% 60)
    results <- vectortolist(results, breaks)
    paste(lapply(results, paste, collapse = ", "), collapse = ", \n")
}
@hadley

This comment has been minimized.

Show comment Hide comment
@hadley

hadley Jul 28, 2011

Collaborator

Would you mind also supply a couple of test cases? (And I think you can simplify your code by doing results <- split(results, cuml %/% 60) )

Collaborator

hadley commented Jul 28, 2011

Would you mind also supply a couple of test cases? (And I think you can simplify your code by doing results <- split(results, cuml %/% 60) )

@eusebe

This comment has been minimized.

Show comment Hide comment
@eusebe

eusebe Jul 29, 2011

Good point! Here it is:

usage <- function (args)
{
    is.missing.arg <- function(arg) {
        is.symbol(arg) && deparse(arg) == ""
    }
    arg_to_text <- function(arg) {
        if (is.missing.arg(arg))
            return("")
        text <- deparse(arg, backtick = TRUE, width.cutoff = 500L)
        text <- str_replace_all(text, fixed("%"), "\\%")
        paste(" = ", paste(text, collapse = "\n"), sep = "")
    }
    arg_values <- vapply(args, arg_to_text, character(1))
    ## paste(names(args), arg_values, collapse = ", ", sep = "")
    results <- paste(names(args), arg_values, sep = "")
    cuml <- cumsum(nchar(results))
    results <- split(results, cuml %/% 60)
    paste(lapply(results, paste, collapse = ", "), collapse = ", \n")
}

Some tests:

foo <- function(a = TRUE, b = 1:10, c = "foo foo bar foo bar", d = expression(a == b), e = "foo bar again") {
  return(NULL)
}
bar <- function(a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np. \n\n", d = "sdj") {
  return(NULL)
}
> cat(usage(formals(foo)), sep = "\n")
a = TRUE, b = 1:10, c = "foo foo bar foo bar", 
d = expression(a == b), e = "foo bar again"
> cat(usage(formals(bar)), sep = "\n")
a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np. \n\n", 
d = "sdj"

I will try to include this in the package and to propose a real git patch this week end (I can't from my work computer).

eusebe commented Jul 29, 2011

Good point! Here it is:

usage <- function (args)
{
    is.missing.arg <- function(arg) {
        is.symbol(arg) && deparse(arg) == ""
    }
    arg_to_text <- function(arg) {
        if (is.missing.arg(arg))
            return("")
        text <- deparse(arg, backtick = TRUE, width.cutoff = 500L)
        text <- str_replace_all(text, fixed("%"), "\\%")
        paste(" = ", paste(text, collapse = "\n"), sep = "")
    }
    arg_values <- vapply(args, arg_to_text, character(1))
    ## paste(names(args), arg_values, collapse = ", ", sep = "")
    results <- paste(names(args), arg_values, sep = "")
    cuml <- cumsum(nchar(results))
    results <- split(results, cuml %/% 60)
    paste(lapply(results, paste, collapse = ", "), collapse = ", \n")
}

Some tests:

foo <- function(a = TRUE, b = 1:10, c = "foo foo bar foo bar", d = expression(a == b), e = "foo bar again") {
  return(NULL)
}
bar <- function(a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np. \n\n", d = "sdj") {
  return(NULL)
}
> cat(usage(formals(foo)), sep = "\n")
a = TRUE, b = 1:10, c = "foo foo bar foo bar", 
d = expression(a == b), e = "foo bar again"
> cat(usage(formals(bar)), sep = "\n")
a = "sdmldsflkjmsqdlkfjk", b = "fmqlk", c = "\n\np. \n\n", 
d = "sdj"

I will try to include this in the package and to propose a real git patch this week end (I can't from my work computer).

@yihui

This comment has been minimized.

Show comment Hide comment
@yihui

yihui Aug 11, 2011

Contributor

beside \usage{}, another danger of wrapping the texts is it may mess up the text inside \preformatted{}, which should not be touched by definition

Contributor

yihui commented Aug 11, 2011

beside \usage{}, another danger of wrapping the texts is it may mess up the text inside \preformatted{}, which should not be touched by definition

@eusebe

This comment has been minimized.

Show comment Hide comment
@eusebe

eusebe Sep 20, 2011

I've seen this commit
But the usage string is still wrapped with the development version of roxygen (str_wrap() is used in format_collapse() function).

eusebe commented Sep 20, 2011

I've seen this commit
But the usage string is still wrapped with the development version of roxygen (str_wrap() is used in format_collapse() function).

@hadley

This comment has been minimized.

Show comment Hide comment
@hadley

hadley Sep 20, 2011

Collaborator

@eusebe - that commit is unrelated to this problem.

Collaborator

hadley commented Sep 20, 2011

@eusebe - that commit is unrelated to this problem.

@hadley hadley closed this in c3323a5 Oct 25, 2011

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment