untracked: Allow user to specify more aggregation functions, include … #54

Open
wants to merge 2 commits into
base: master
from

Conversation

oliverbock
Contributor

…countdistinct

@oliverbock
Contributor Author

@jrwishart ?

@jrwishart
Contributor

> @jrwishart ?

Apologies. I was away the entire month of April and this slipped through the cracks on my return. Looking now.

Contributor

@jrwishart jrwishart left a comment

The only major point is the distinctBy vs distinct_by naming: I fear the mismatch may be an unintended bug.
Otherwise, just minor, non-blocking quibbles.

R/Factbase.R Outdated
Comment on lines 122 to 123
if (!is.null(distinct_by))
    metric$distinctBy <- distinct_by
Contributor

The if check is unnecessary. A NULL element is only stored when it is supplied in the constructing call to list(). When modifying an existing list with $<-, assigning NULL removes that element, or does nothing if the element does not exist.

> list(x = 1, y = 2, z = NULL)
$x
[1] 1

$y
[1] 2

$z
NULL

> l <- list(x = 1, y = 2)
> l$z <- NULL
> l
$x
[1] 1

$y
[1] 2

Relatedly, I'm confused by the mix of distinct_by and distinctBy. There may be a good reason for it, but it tripped me up in a later comment too.
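
To make the point about the redundant check concrete, here is a minimal sketch. The metric list and its fields are made up for illustration and are not taken from the PR; only the distinctBy assignment mirrors the diff.

```r
# Hypothetical metric list; the field names are illustrative only.
metric <- list(name = "sales", aggregation = "count")
distinct_by <- NULL

# With the guard: nothing is added when distinct_by is NULL.
guarded <- metric
if (!is.null(distinct_by))
    guarded$distinctBy <- distinct_by

# Without the guard: assigning NULL via $<- also leaves the list unchanged.
unguarded <- metric
unguarded$distinctBy <- distinct_by

identical(guarded, unguarded)  # TRUE

# To actually store a NULL element in an existing list, wrap it in list().
with_null <- metric
with_null["distinctBy"] <- list(NULL)
names(with_null)
```

The last three lines show the one case where the distinction matters: if a NULL entry ever had to be kept in the list, the plain assignment would silently drop it.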

R/Factbase.R Outdated
Comment on lines 150 to 155
    return (list(aggregation = aggregation, distinct_by=NULL))
distinct_by <- str_match(aggregation, 'countdistinct\\(([^)]+)\\)')[1, 2]
if (is.character(distinct_by)) {
    if (!(distinct_by %in% names(data)))
        stop(paste0("Column '", distinct_by, "' is referred to in 'aggregation' but does not exist in 'data'"))
    return (list(aggregation = 'count', distinctBy = distinct_by))
Contributor

The returned list uses distinct_by in the first return and distinctBy in the second. I'm guessing one of these needs to change for consistency.

Contributor Author

Whoops. Thanks.
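
For reference, one way the two return paths could share a single name is sketched below. This is not the PR's code: the function name parse_aggregation_sketch is hypothetical, the choice of distinct_by over distinctBy is an assumption, and base R's regexec/regmatches stand in for stringr's str_match so the snippet runs without extra packages. The pattern and error message are copied from the diff.

```r
# Hypothetical helper, not the PR's code: parses 'countdistinct(<column>)'
# and returns a list with the same two names on every path.
parse_aggregation_sketch <- function(aggregation, data) {
    pattern <- "countdistinct\\(([^)]+)\\)"
    # regmatches(...)[[1]] is character(0) when the pattern does not match.
    m <- regmatches(aggregation, regexec(pattern, aggregation))[[1]]
    if (length(m) == 0)
        return(list(aggregation = aggregation, distinct_by = NULL))
    distinct_by <- m[2]  # first capture group: the column name
    if (!(distinct_by %in% names(data)))
        stop(paste0("Column '", distinct_by,
                    "' is referred to in 'aggregation' but does not exist in 'data'"))
    list(aggregation = "count", distinct_by = distinct_by)
}

# Made-up example data.
df <- data.frame(customer = c("a", "b", "a"), value = 1:3)
names(parse_aggregation_sketch("sum", df))
names(parse_aggregation_sketch("countdistinct(customer)", df))
```

Both calls return a list named aggregation and distinct_by, so callers can read result$distinct_by without caring which branch was taken.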

Comment on lines +146 to +148
validate_aggregation <- function(aggregation, data) {
    if (!is.character(aggregation) || length(aggregation) != 1)
        stop("'aggregation' must be a character vector of length 1")
Contributor

Is data always intended to be a data.frame? If so, could you add a validation check like the one aggregation has?

Contributor Author

Yes, that is done separately in validate_dataframe(). The data frame is only passed in here after it has already been validated.
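
The ordering described in this reply can be sketched roughly as follows. The body of validate_dataframe() is not shown in this thread, so every function below (all suffixed _sketch) is a hypothetical stand-in; only the aggregation check is taken from the diff.

```r
# Hypothetical stand-in; the real validate_dataframe() is not shown here.
validate_dataframe_sketch <- function(data) {
    if (!is.data.frame(data))
        stop("'data' must be a data.frame")
    invisible(data)
}

validate_aggregation_sketch <- function(aggregation, data) {
    # Assumes 'data' has already passed validate_dataframe_sketch().
    if (!is.character(aggregation) || length(aggregation) != 1)
        stop("'aggregation' must be a character vector of length 1")
    invisible(aggregation)
}

# Hypothetical caller: validate the data frame first, then the aggregation.
check_inputs_sketch <- function(data, aggregation) {
    validate_dataframe_sketch(data)
    validate_aggregation_sketch(aggregation, data)
    TRUE
}

check_inputs_sketch(data.frame(x = 1), "sum")
```

Keeping the data frame check in one place means validate_aggregation() can rely on names(data) being meaningful without repeating the type check.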

2 participants