Moving spatial related R packages to r-spatial organization? #11
As @tim-salabim said yesterday, he wants to move
To promote this organization and to have spatial-related R packages in one place, could we think about moving more spatial-related R packages here?
(feel free to suggest more, this was just a brief start of packages which came to my mind)
I appreciate the idea of having a starting point or home base for spatially focused R work. It would be convenient to find all up-to-date releases in one place.
In my opinion, this is not covered by a more or less loose collection of "spatial" packages by itself. It is more a specific way of thinking about and solving problems.
So, in addition, I would also be very interested in developing and providing a kind of open course or training system that links these spatial packages and concepts while addressing real-world questions.
There are a lot of good reasons to use R as a scripting language - the main reason for me is the low entry barrier for students. Nevertheless, and probably because my roots are in the remote sensing and modeling community, I often need to deal with data that seems to be far beyond the typical R scope. Therefore I am also interested in developing packages that simplify the use of all these excellent and mature GIS and CLI tools outside of R.
For both goals I highly appreciate the idea of having a kind of common home. I think r-spatial would be a great place to start.
From my side, I would be happy to move the link2GI package, which supports the easy integration and partial wrapping of rgrass7, SAGA GIS, OTB and some other big ones.
Which packages should be integrated?
Sure, this is a major point which needs to be discussed. Too many "small" packages would make the organization messy so I would suggest to include only "major" ones for a start.
Everybody active here has a rough sense of what the major r-spatial packages are. If a new one comes up with a large user base, we could invite the owner to move to r-spatial. I do not think there needs to be a hard threshold based on monthly download numbers or something similar.
https://cran.r-project.org/web/views/Spatial.html gives a good starting base for spatial related R packages.
Who should make these decisions?
I would say everybody can suggest invitations to r-spatial, but only a few selected guys should then decide whether it's okay or not. A vote among all members would take way too long, wouldn't it? In particular I refer to @rsbivand, @edzer + one or two more? Of course these guys need to be active here so that we do not face issues related to late responses or similar. These guys would then be the "admins", or whatever it is called.
@pat-s not at all ;-)
If you take for instance GRASS7x, each installer and each Windows version requires slightly different paths, settings and so on. Roger's rgrass7 is great but does not really cover this. The user has to organize this manually, and that is only possible with a lot of system knowledge and admin rights. It is even worse with SAGA GIS and RSAGA. The developers of SAGA are permanently changing their API calls; as a result, RSAGA only supports the 2.04-2.27 versions... Endless story.
To make it short, the package provides something like
I do not think that it has to be hosted at r-spatial, because I am sure that it is of interest to only a small number of people messing around with a lot of API and CLI calls. Nevertheless I will update it soon on CRAN, and once it is a bit more mature it could be of interest for the r-spatial community.
@pat-s just some additional clarification notes.
However, keeping in mind that the QGIS concept of integrating tons of external tools already provides a great deal of GIS/RS/modeling functionality, it is (1) still a subset of the capabilities of the contributing software packages
I think it is error-prone, a bit insular and highly inefficient to forgo the full contribution of all this well-designed and mature spatial software.
To make it available to a wider R community we should make some effort, and I think r-spatial would be a good place for doing this.
To organize ourselves, I think we need to do more substantial work than moving repositories.
Although I have no objections against doing so, I've mentioned earlier that I don't know of good reasons to move repositories here, except for the hope that someone will find them here. But once we have more than 30 repos, will people make the effort of going through the whole list? Nothing is easier than moving a repo, but will it be found more often? @pat-s : your link points out the difference between an orga and a user account, but please tell me what, for us, the real reason is to move all things here.
I think that, as long as we don't have anything better, the primary point for finding spatial packages will be the CRAN spatial task view. From there, you find CRAN packages. I see CRAN as the place for packages of which the developer thinks they're useful and mature enough to not only be used but also to be relied upon by others, for their work. From the CRAN package, if the pkg is on GH, you'll find the GH link.
Finding packages that are potentially of use, but not on CRAN, such as several of @bhaskarvk's or @mdsumner's packages, would be helped by having an index (similar to the task view) for these packages here. Writing such an index is valuable work, and much more effective for the purpose of packages being found (and then used) than moving repos here.
Moving a repo here means that you'll have to trust the orga admins, currently @tim-salabim and me, being an admin over it. It separates the package somewhat from the primary author, which may not be what the author wants.
The comparison to the rstudio and ropensci organisations is not about fame: both reflect legal bodies with substantial resources that are being used for package maintenance and development, and that may collect copyrights. We are not such an organisation.
Finally, the maintainers of rgeos, rgdal, spdep, maptools, raster and geosphere (and many other packages) don't use github for their code development.
From my point of view, this is not a sensible use of time. Packages hosted on R-Forge under SVN may be hard to follow for those with insufficient experience, but the substantial effort of rebasing them on github (which I sincerely dislike) will not add any functionality. Note that at the point at which some government blocks github, people in places we care about will lose access to source code etc. Not choosing github is not a definition of un-coolness; it may be simply history, and github's day will also pass - at least we need to be aware that it may.
As you note, the task view is there, has been there since forever (also the Spatio-temporal task view); it would be much more helpful to join/assist efforts to help the whole task view infrastructure to scale. So is the mailing list, which is actually where most real interaction with users occurs.
It is important not to gate-keep, and not to curate (certainly not heavy-handedly). Users' needs differ enormously, also over time, and in an ecology the packages which fit purposes (that may often not be known to the authors) get used. There are lots of opportunities for boosting around, many of them actually lead users to suboptimal choices, and without direct contact it is very hard to guess what may be helpful. Curating assumes intelligent design, which will lead to bitrot with very high probability.
It is true that you will find everything at the CRAN spatial task view. But it is also true that it is cumbersome to do so, especially for somebody who is not used to this specific R world. To me it feels like surfing through a lot of similar packages and descriptions; it is often hard to understand why to use this or that or even something else, and it is almost impossible to tell which approach and package is appropriate...
@edzer I am not quite sure that providing a list of useful packages will help a lot. Maybe it would help more to review such efforts and give brief and clear hands-on examples of how to use them. Perhaps even in a blogroll or something similar.
Somehow I am still convinced that it would be very helpful for us and for a lot of the users to find a structure that bundles the available knowledge about spatial R stuff in a more effective and transparent way than the CRAN spatial task view, lists or the daily stackoverflow searches do.
@gisma Okay I see! That sounds nice, and if it's a generic framework it could be very valuable for the R community. You should also contact the authors of the packages to which this wrapper applies - so that they mention your package on their repo, preferably with a use case scenario to make things easier.
Definitely - the integration of selected packages in r-spatial was just a starting point. Follow-up work would involve a lot of organizational/structural work.
This is for sure one of the main reasons - to have one central place (repo) for the most searched packages. This also simplifies
I would say in the long run, yes. If r-spatial has been well acknowledged as the place for r-spatial packages, users would profit from it.
The link was just to provide a quick starting point for orga discussions. The main points which come to my mind would be
Sure, this depends on the authors. I mean there is no need, it is just an offer. I guess there is no doubt that you both are totally trustworthy and nobody would question that. In fact I would say that without the participation/support of @edzer and @rsbivand (as maybe the two best-known r-spatial persons) this repo would not really be a serious project. In the long run there is not only a need for 2-3 admins but also the need to distribute work among all (since everybody is busy at work with other things than programming packages).
I do not want to raise a legal body with r-spatial. rstudio and ropensci have different aims. R-spatial could just be a central place of r-spatial packages as a starting point for searches & further development (discuss repo, r-packages themselves, hosting or r-spatial.org).
(breaking this comment here)
Could you elaborate more on this? Curious to reflect on your points on this topic.
I do not fully understand this point. Even if GitHub's time passes and governments block it, or another site becomes #1 for package development, the code still exists locally (as it does for packages not hosted on GitHub)?
The task view will never become obsolete. There will always be just a limited number of r-spatial related packages in this orga (speaking hypothetically now), and it will never cover any relations between the packages or be subdivided into sections as the task view is.
Asking the other way round: What is the benefit of not hosting an R package on GitHub (or Bitbucket etc.), not addressing orgas in particular but hosting in general? Doing so opens the development process to the public and provides a place to open issues and install development versions of packages (if desired). None of this is possible if one keeps their development private.
I support this point. However, I want to express that the task view and SO searches have different aims than a GitHub repo/orga. The task view gives logical links between packages and a wide overview of sub-fields of "spatial".
I don't actually do much GIS within R, although I'm happy that you've included
In terms of selecting packages, I would argue that packages that simply provide functionality (like
I think it would be good to have an equivalent of the tidyverse for easy install and use of packages that use sf and stars. Check out what I have to say on the matter here: Robinlovelace/geocompr@c31ad20
Please people (especially @edzer and @tim-salabim ) let @Nowosad and me know if any of this is wrong or becomes out-of-date (trying to write about r-spatial in a future-proof way is not easy since sf - but it's much more fun!).
In OSGeo, there is a much more structured approach to incubation and candidate projects. Nobody there starts with the other end, though. Arguably, things here are much more fine-grained than in OSGeo. For example, in teaching in Poznan using sf, we found that geom_sf was hard to use, but tmap suited the students very well, and rendered faster (subjective impression). I think this is because tmap uses grid directly and 'thinks' cartographically - direct support for classInt - rather than ggplot2's assumptions about aesthetics and having to go through ggplot to get to grid. So what should one recommend now, without running usage studies ... I think that usage RCTs might help to sharpen focus on things like this (we didn't do it, but only had base/lattice to consider).
Just to clarify: RCT refers to randomized controlled trials I believe. Expanding the acronym as it's the first time it's been used in this thread.
I hope that encouraging people to use sf will be beneficial for its development. I will add an issue to add a section on 'contributing to the community' now.
In writing the book I hope we can have a positive impact on open source software development and will continue to log these here: https://github.com/Robinlovelace/geocompr/blob/master/our_impact.md
Increasing the PR:bug ratio in there would also be good, but to me that document shows that there can be benefits associated with communicating about things as they develop.
The deadline for the book is November 2018, so there is some time for things to change/settle. It is great to have an idea of what may or may not be in the pipeline based on open discussions like this, so many thanks for that.