Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Odd changes in dependency hashes #161

Closed
kendonB opened this issue Nov 20, 2017 · 11 comments
Closed

Odd changes in dependency hashes #161

kendonB opened this issue Nov 20, 2017 · 11 comments

Comments

@kendonB
Copy link
Contributor

kendonB commented Nov 20, 2017

Unable to reproduce this but it is concerning. Was there a recent change that might have caused this?

> dp <- dependency_profile(target = "my_target", config = my_config)
> dp$hashes_of_dependencies
                      any               base::match             dplyr::filter
       "743ca7b5940834b2"        "bcb74ceb208b327d"        "e61d3bdce0e7f2a6"
            dplyr::mutate             dplyr::select        extraction_indices
       "1931c6e41eeef8c0"        "d23c29e93b8788db"        "c85d88fc56f4e042"
                   ifelse                     is.na                    length
       "e82fad7bbae41e69"        "6d117c94ba589531"        "8c0b92110c775625"
        raster::getValues            raster::raster                       sum
       "9c33f636c1ef3496"        "74bed7a08fc9b200"        "b772c3b3e648dbed"
              sum_weights                   weights
       "c85d88fc56f4e042"        "efd85293746e3351"
>
> map_chr(names(dp$hashes_of_dependencies), function(x) my_config$cache$get_hash(key = x)) %>%
+   set_names(names(dp$hashes_of_dependencies))
                      any               base::match             dplyr::filter
       "15f4a4ccbb300aa8"        "3d423d56f81bc498"        "8f41a5d596b3c891"
            dplyr::mutate             dplyr::select        extraction_indices
       "ab17fdc46a5e66c3"        "3160b07f69954005"        "c85d88fc56f4e042"
                   ifelse                     is.na                    length
       "b406fb0764a8812f"        "1d649050f466e8ee"        "3462a4fc03fca6a9"
        raster::getValues            raster::raster                       sum
       "559202ee38a46eb2"        "9c2bf33acbf9badb"        "623861fcb1c7db1a"
              sum_weights                   weights
       "c85d88fc56f4e042"        "054479aec40580fc"
@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Nov 20, 2017

I am glad you are checking dependency_profile(), your vigilance is quite helpful.

I think the hashes you want are in the "kernels" namespace:

map_chr(
  names(dp$hashes_of_dependencies),
  function(x){
    my_config$cache$get_hash(key = x, namespace = "kernels")
  }
) %>%
  set_names(names(dp$hashes_of_dependencies))

I think this is another change from #129, which introduced some new vocabulary. In drake, a "kernel" is the piece of the target/import that is reproducibly tracked. For ordinary R objects, the kernel is just the object itself. But for functions, the kernel is a deparsed function body, along with the dependency hash if the function is imported (see store_function()).

In e58afd1, I did change the behavior of tidy_command() (previously tidy()). There was an edge case in parse(), so I reluctantly moved to formatR::tidy_source(), which I think is for the best in the long run. But that should have only affected commands, not dependency hashes or imports.

Whether or not any of this explains what you see, I do not plan to revert any changes, so I am closing this as an issue. Of course, let's keep talking on the thread. I apologize if I broke your project yet again.

@kendonB
Copy link
Contributor Author

kendonB commented Nov 21, 2017

I rebuilt my project from scratch after #129 (I deleted .drake), so that shouldn't explain this.

I do see the hashes match when using your code. However, I see that the target is outdated when using outdated and I can't see why drake thinks that. Help?

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Nov 21, 2017

If the dependency hashes match, then it could be e58afd1 after all. Does dependency_profile("my_target", my_config)$cached_command match drake:::tidy_command(my_config$plan$command[my_config$plan$target == "my_target"])? Assuming your target is an ordinary R object and not a file, that would be the easiest next guess. See should_build_target().

@kendonB
Copy link
Contributor Author

kendonB commented Nov 21, 2017

Those also match:

> dependency_profile("my_target", my_config)$cached_command ==
+ drake:::tidy_command(my_config$plan$command[my_config$plan$target == "my_target"])
[1] TRUE
>

@wlandau-lilly
Copy link
Collaborator

Strange. Is cached_dependency_hash different from current_dependency_hash in dependency_profile()?
If not, you might want to debug(drake:::should_build_target) (or better yet, insert if(target == "my_target") browser() at the top of should_build_target()) and then run outdated()?

@kendonB
Copy link
Contributor Author

kendonB commented Nov 21, 2017

My bad. I was looking at a downstream target at my last comment.

> identical(dependency_profile("my_other_target", my_config)$cached_command,
+ drake:::tidy_command(my_config$plan$command[my_config$plan$target == "my_other_target"]))
[1] FALSE

Any suggested workaround? A migrate function?

@wlandau-lilly
Copy link
Collaborator

wlandau-lilly commented Nov 21, 2017

In 2540a65, I modified the supporting functions of migrate_drake_project() to cover projects from the latest CRAN release. To build in migration functions to accommodate changes in the development version probably would probably be tedious and impractical. I am sorry, but you may just have to manually cache the new commands, something like the following.

for(target in my_config$plan$target){
  new_command <- my_config$plan$command[my_config$plan$target == target] %>%
    drake:::tidy_command()
  my_config$cache$set(key = target, value = new_command, namespace = "commands")
}

I hate to make changes that break back compatibility, even between commits of the development version. In this case, however, relying on formatR::tidy_source() in tidy_command() is the right thing to do. It is unlikely to have the weird edge cases of any custom parsing code I write, and the package itself is stable. And since the next CRAN release won't be back compatible anyway, it's now or never.

@kendonB
Copy link
Contributor Author

kendonB commented Nov 21, 2017

No worries! I was mostly concerned about some unknown fundamental problem and this is not that. Thanks heaps, Will

@wlandau-lilly
Copy link
Collaborator

You're welcome, Kendon. I am glad you spoke up.

@kendonB
Copy link
Contributor Author

kendonB commented Nov 21, 2017

@wlandau-lilly, you need to unquote the "target" above for any future googlers.

@wlandau-lilly
Copy link
Collaborator

Yes, fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants