Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fabricate rewrite PR #41

Merged
merged 51 commits into from
Dec 20, 2017
Merged

Fabricate rewrite PR #41

merged 51 commits into from
Dec 20, 2017

Conversation

aaronrudkin
Copy link
Contributor

This is the PR that will merge in all of the new level syntax, including the cross_level stuff.

Travis:
Travis is currently failing on a specific test even though R CMD Check on my local system is working. @nfultz figures it's a weird Travis quirk, but it's POSSIBLE that there is an actual build issue (see builds 280, 281, 283). I'll be back this evening if I'm needed to address this stuff.

Notes about backwards compatibility:
The one thing that I am not 100% certain about in terms of replicating the existing behaviour of the old level function is the exact conditions under which a new ID label column is added to data. This is not breaking when it comes to any substantive stuff, it's just that a data frame might have an extra ID column relative to some past workflows. I began a discussion with @graemeblair about this in issue #33 -- I just haven't fully doubled back and verified exactly what's going on in the code.

Not done:
Vignettes related to cross-classifying data; draw_discrete changes

Code review / refactoring notes:
Some of the error handling and setup for the level functions is slightly duplicated and could possibly be abstracted into helper functions; the fabricate.r file should be split up into multiple files.

aaronrudkin and others added 30 commits November 2, 2017 15:31
…e, will implement similar strategy in the refactor in next few commits. This will generate a warning or error on the build.
…nit tests to reflect new names, bugfixes to correct unit tests
…avis but not on local builds, related to building the PDF manual.
…umentation for ICC functions, and fixed bug with normal ICC data.
Copy link
Contributor

@nfultz nfultz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks excellent and pretty well tested. The stuff with building a working environment could use a second pass (not using lists inside the env could simplifiy quite a bit of the book keeping) but if it's working we can always change it later. 👍 overall.

NAMESPACE Outdated
export(fabricate)
export(join)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close to overloading dplyr funs, but barely doesnt ;)

if(is.null(sigma)) {
if(is.atomic(rho) & length(rho)==1) {
if(ndim>2 & rho<0) {
stop("The correlation matrix must be positive semi-definite. In specific, ",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Specifically"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omg grammar correction in a code review

}

# Can we use the fast package or are we stuck with the slow one?
use_f = use_f && requireNamespace("mvnfast", quietly = TRUE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍


} else {
# Using mvnfast
correlated_sn = mvnfast::rmvn(N, ncores = 2, mu, sigma)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getOption("mc.cores")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, nuts, this is clearly an artifact from putzing around on my machine. Will fix.

# with right_chol to make it correlated.
correlated_sn <- matrix(rnorm(N * ndim),
nrow = N,
byrow = TRUE) %*% right_chol
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for byrow


# User is adding a new level, but already has a working data frame.
# Shelf the working data frame and move on
if("data_frame_output_" %in% names(working_environment_)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic is a bit hard to follow, going to meditate on it.

# Any obs that matches the level matching this i will be a duplicate of this i.
index_maps[
working_environment_$data_frame_output_[[ID_label]] == unique_values_of_level[i]
] = i
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yikes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm really open to a better implementation of this, I'm sure there is one.

# Move the current working data frame into a package
package_df = list(data_frame_output_ = working_environment_$data_frame_output_,
level_ids_ = working_environment_$level_ids_,
variable_names_ = names(working_environment_$data_frame_output_))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this snippet pops up a bit, might want to make a helper / constructor function.

R/fabricate.R Outdated
# Loop over the variable name

variable_names = by$variable_names
data_frame_indices = numeric(length(variable_names))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length always returns an int I believe.

# What would the indices of the quantiles be if our data was ordered --
# if the answer is below 0, set it to 1. round will ensure the tie-
# breaking behaviour is random with respect to outcomes
ordered_indices = pmax(1,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be factored out of the lapply to a seperate sweep statement -


test_that("Deliberate failures in cross_level", {
expect_error(
test_next = fabricate(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should not have a name (test_next) on this expect_error (test-crossclassified line 96) -

@aaronrudkin
Copy link
Contributor Author

aaronrudkin commented Dec 20, 2017

I believe I've implemented all of @nfultz's suggestions that were trivial to do, as well as bumped up the test coverage to hit all the code. The remainder of the code review suggestions I've issued. Provided everyone else is fine, this can be squashed and merged now.

Phew. Feels good to have over a month of hard work finally be merged into master.

@coveralls
Copy link

Coverage Status

Coverage increased (+23.2%) to 98.554% when pulling 289693b on fabricate_rewrite into fd88f3c on master.

@coveralls
Copy link

Coverage Status

Coverage increased (+24.5%) to 99.889% when pulling f33c26d on fabricate_rewrite into fd88f3c on master.

@coveralls
Copy link

Coverage Status

Coverage increased (+24.5%) to 99.889% when pulling ad6b54c on fabricate_rewrite into fd88f3c on master.

@graemeblair graemeblair merged commit 78f56bb into master Dec 20, 2017
@graemeblair graemeblair deleted the fabricate_rewrite branch January 18, 2018 07:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants