asearch for any local repository not just the default local repo #298

Closed
reichek opened this issue Oct 18, 2016 · 17 comments

Comments

@reichek

reichek commented Oct 18, 2016

Is it possible to enable asearch to search in any local repo specified as asearch(repo="PATH_TO_LOCAL_REPO") and not just in the default local repo?

@MarcinKosinski
Collaborator

@reichek probably

sapply(character_vector_with_repository_dirs, function(one_repo){
  asearch(patterns, one_repo)
})

This should do it :)

@reichek
Author

reichek commented Oct 18, 2016

Thanks for your prompt reply, but specifying the relative or absolute path of a local repo in asearch does not work for me.

> class(repoInputData)
[1] "character"

> showLocalRepo(repoDir=repoInputData)
                            md5hash                             name
1  8f784555be2d69d772a65734637a86ad                        list.data
2  18c1ed3744ca3a4e290621631400e90b 18c1ed3744ca3a4e290621631400e90b
3  8f784555be2d69d772a65734637a86ad                        list.data
4  ddfdb3eb402d0ebf3a631948a3310277 ddfdb3eb402d0ebf3a631948a3310277
5  ff575c261c949d073b2895b05d1097c3                        list.data
6  ddfdb3eb402d0ebf3a631948a3310277 ddfdb3eb402d0ebf3a631948a3310277

# Failure
> asearch(patterns = c("class:list"), repo=repoInputData)
Error: length(elements) >= 2 is not TRUE

# Failure
> asearch(patterns = c("class:list"), repo=normalizePath(repoInputData))
Error in downloadDB(remoteHook) : 
  Such a repo: [...] does not exist or there is no archivist-like Repository on this repo.

# Success
> setLocalRepo(repoDir=repoInputData)
> asearch(patterns = c("class:list"))
$`8f784555be2d69d772a65734637a86ad`
$`8f784555be2d69d772a65734637a86ad`$iris
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1            5.1         3.5          1.4         0.2     setosa
2            4.9         3.0          1.4         0.2     setosa
3            4.7         3.2          1.3         0.2     setosa
4            4.6         3.1          1.5         0.2     setosa

# session info
> sessionInfo()
R version 3.3.0 (2016-05-03)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 14.04.5 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] archivist_2.1 knitr_1.14    ggplot2_2.1.0 rmarkdown_1.0

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.7      rstudioapi_0.6   magrittr_1.5     devtools_1.12.0 
 [5] munsell_0.4.3    colorspace_1.2-6 R6_2.1.3         stringr_1.1.0   
 [9] httr_1.2.1       plyr_1.8.4       tools_3.3.0      grid_3.3.0      
[13] gtable_0.2.0     DBI_0.5-1        withr_1.0.2      htmltools_0.3.5 
[17] yaml_2.1.13      digest_0.6.10    formatR_1.4      bitops_1.0-6    
[21] RCurl_1.95-4.8   memoise_1.0.0    evaluate_0.9     RSQLite_1.0.0   
[25] labeling_0.3     stringi_1.1.1    scales_0.4.0     lubridate_1.6.0 

@MarcinKosinski
Collaborator

OK, now I get it: we need to set the repo inside sapply for every element.

sapply(character_vector_with_repository_dirs, function(one_repo){
  setLocalRepo(repoDir=one_repo)
  asearch(patterns)
})

Is that better?

@MarcinKosinski
Collaborator

Remember that once the loop has finished, the last element of character_vector_with_repository_dirs remains the default local repository.
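
One way to limit that side effect is to remember the previous default and put it back afterwards. The following is only a minimal sketch: searchWithRepo is a hypothetical helper, not part of archivist, and it assumes that aoptions("repoDir") returns the current default repository (or NULL when none has been set).

```r
library(archivist)

# Hypothetical helper (not part of archivist): run asearch against one local
# repository and then restore whichever default repository was set before.
searchWithRepo <- function(patterns, repoDir) {
  # Assumption: aoptions("repoDir") yields the current default, or NULL.
  previous <- aoptions("repoDir")
  if (!is.null(previous)) {
    # Restore the old default even if asearch() fails.
    on.exit(setLocalRepo(repoDir = previous), add = TRUE)
  }
  setLocalRepo(repoDir = repoDir)
  asearch(patterns = patterns)
}
```

A call such as searchWithRepo("class:list", repoInputData) would then leave the earlier default in place. If no default had been set beforehand, repoDir stays the default after the call.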

@reichek
Author

reichek commented Oct 18, 2016

Sorry, I see I need to be more precise. In the same R Markdown script I would like to use one repo for the input data and another repo for all plots and results generated throughout the script. To avoid confusion between different settings of the default local repository, I would prefer to call archivist::asearch with a directly specified local repo (without resetting the default repo at each invocation of asearch). The same feature already exists for repos on GitHub, where you can search either a specified GitHub repo or the default GitHub repo.

@MarcinKosinski
Collaborator

So using asearch with several local repositories is only possible via the setLocalRepo function, as stated in the Note of ?asearch:

Remember that if you want to use local repository you should set it to default.

There isn't any other way for local repositories right now. But maybe you could share your results in a remote repository? In that case you specify the user and the GitHub repository each time, so there is no room for confusion :)

@reichek
Author

reichek commented Oct 18, 2016

So, you do not plan to update 'asearch', 'asession', and 'aread' such that local repositories can be specified directly? I was wondering if you could provide this feature for these functions?

Thanks for your help in advance.

@MarcinKosinski
Collaborator

@pbiecek do you think we can work on this?

2016-10-18 16:14 GMT+02:00 reichek notifications@github.com:

So, you do not plan to update asearch such that local repositories can be
specified directly? I was wondering if you could provide this feature for
asearch because for most of the archivist functions a direct definition
of a local repo is enabled.

Thanks for your help in advance.

@pbiecek
Owner

pbiecek commented Oct 18, 2016

Please correct me if I've missed some detail, but asearch is just a wrapper over searchInLocalRepo with a shorter name and limited options.

And in the searchInLocalRepo function you can specify the repoDir for each call.

So maybe it's enough to use searchInLocalRepo instead of asearch?

On 18.10.2016 at 16:58, Marcin Kosiński notifications@github.com wrote:

asearch

@MarcinKosinski
Collaborator

asearch is a wrapper around searchInLocalRepo and loadFromLocalRepo (so it also loads objects), but this is not the point.

The point is that asearch is the standard used to provide hooks under results in reports (after archive or after addHooksToPrint).

@reichek
Author

reichek commented Oct 19, 2016

Thanks for your valuable comments!
Since I am new to archivist I might not be aware of all features of your
great package. The problem I have is that I would like to use a repo for
input data (generated in script A) and a repo for results (generated in
script B). This gives us the required freedom to ensure version control
over input data, while different scripts use this input data to compute
different analyses. Both data sets (input and output data) must be stored
in local repositories. So, if all archivist functions allowed the local
repository to be specified, we would avoid searching, reading, or writing
in the wrong repo (just because it happens to be the current default
repo). Is this possible to implement in general, or would it require
substantial rewriting of archivist?

Thanks again for your help,
Kristin

@MarcinKosinski
Collaborator

So I would suggest using searchInLocalRepo + loadFromLocalRepo for local
repositories right now.
asearch is only a wrapper and is not as powerful as
searchInLocalRepo + loadFromLocalRepo, which can take repoDir as a parameter.
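
As an illustration of that suggestion, here is a minimal sketch that keeps input data and results in two separate local repositories and never changes the default. The repository paths and the use of iris are assumptions made for the example, and saveToLocalRepo is assumed to be available in the installed archivist version (older releases call it saveToRepo).

```r
library(archivist)

# Two throwaway repositories standing in for "input data" and "results"
# (hypothetical paths chosen for this example).
inputRepo  <- file.path(tempdir(), "inputRepo")
outputRepo <- file.path(tempdir(), "outputRepo")
createLocalRepo(repoDir = inputRepo)
createLocalRepo(repoDir = outputRepo)

# Archive an object in the input repository only.
saveToLocalRepo(iris, repoDir = inputRepo)

# Search with an explicit repoDir; the default repository is not involved.
hashes <- searchInLocalRepo(pattern = "class:data.frame", repoDir = inputRepo)

# Load every match from the same repository as R objects.
inputData <- lapply(hashes, function(h) {
  loadFromLocalRepo(md5hash = h, repoDir = inputRepo, value = TRUE)
})
```

Because repoDir is passed on every call, script A and script B can each point at their own repository without a setLocalRepo call in between.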

@pbiecek
Owner

pbiecek commented Jun 23, 2017

@MarcinKosinski Are you planning any implementation or documentation updates related to this issue, or shall we close it?

@MarcinKosinski
Collaborator

This issue isn't resolved, so perhaps it shouldn't be closed. I never intended to provide functionality that would improve this approach. It's better to leave this open, as someone may someday provide a PR for it.

@MarcinKosinski
Collaborator

So in the end this is not an issue, but a question with some suggestions from our perspective. I'd close this, as it's not an issue that can be fixed but specified behavior and a feature.

@pbiecek
Owner

pbiecek commented Nov 24, 2017

Actually, I think that the issue should be addressed somehow.
The dirty solution is to create functions asearchLocal() and areadLocal() which will work like asearch() and aread() for local repos.
These will basically be just wrappers over loadFromLocalRepo and multiSearchInLocalRepo, but with nicer names and nicer defaults.

pbiecek added a commit that referenced this issue Nov 24, 2017
@pbiecek
Owner

pbiecek commented Nov 24, 2017

Candidate fix in aa49767:
the new functions areadLocal and asearchLocal allow the local repository directory to be specified directly.
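
A minimal usage sketch of the two new functions follows. The argument names (patterns, md5hash, repo) are assumptions based on how asearch and aread are called elsewhere in this thread, so check ?asearchLocal and ?areadLocal in your installed version; the repository path and the iris object are placeholders.

```r
library(archivist)

# Throwaway repository for the example (hypothetical path).
resultsRepo <- file.path(tempdir(), "resultsRepo")
createLocalRepo(repoDir = resultsRepo)

# Archive one object and keep its md5hash as a plain character string.
hash <- as.character(saveToLocalRepo(iris, repoDir = resultsRepo))

# Search and read with the repository given directly, without setLocalRepo().
# Assumption: both functions take the directory in an argument named `repo`.
found  <- asearchLocal(patterns = "class:data.frame", repo = resultsRepo)
single <- areadLocal(md5hash = hash, repo = resultsRepo)
```

This covers the original request: each call names its repository, so the default local repository never needs to be switched.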

pbiecek closed this as completed Nov 24, 2017