Binder - notes from provenance/legal/authorship breakout
Gail, Carol, Jon Udell, Christie Bahlai
Basic questions --
- how do we do provenance and tracking when remixing/synthesizing/meta-analyzing, specifically in binder?
- what kind of metadata do we track?
- what legal interoperability issues arise from this remixing, redistribution, and public display of content from disparate sources?
e.g. in Tim's bike binder, there were URLs that were being downloaded.
Gail: "even the mosaic is protectable", i.e. the selection and arrangement of components is itself protectable! A notebook is a bunch of these "tiles".
For legal interoperability you want all of the rights and permissions to be explicit and actionable in an automated way (reference CODATA-RDA Legal Interop Principles & Guidelines, 2016).
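Rough sketch of what "explicit and actionable in an automated way" could look like, assuming each tile carries an SPDX license identifier in its metadata. The `Tile` class, the `ALLOWED_LICENSES` allowlist, and `rights_problems` are hypothetical names made up for illustration; nothing here is an existing binder feature, and which licenses belong on the allowlist is a policy question this does not answer.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative allowlist of SPDX license identifiers (an assumption, not a
# recommendation): the real list would be a policy decision.
ALLOWED_LICENSES = {"CC0-1.0", "CC-BY-4.0", "MIT", "BSD-3-Clause"}


@dataclass
class Tile:
    """One component ("tile") of a notebook: a dataset, script, figure, etc."""
    source_url: str
    license_id: Optional[str] = None  # SPDX identifier, or None if unstated


def rights_problems(tiles):
    """Return (source_url, reason) pairs for tiles whose rights are not explicit."""
    problems = []
    for tile in tiles:
        if tile.license_id is None:
            problems.append((tile.source_url, "no license stated"))
        elif tile.license_id not in ALLOWED_LICENSES:
            problems.append(
                (tile.source_url, "license not on allowlist: " + tile.license_id)
            )
    return problems
```

A platform could run a check like this before forking or displaying a notebook and refuse (or ask for a click-through) when the list is non-empty.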
Gail: I was amazed that there was no click-through authorization or liability waiver when forking/executing notebooks of uncertain/unknown provenance for display and reuse on a third-party public platform!! That's risky for said platform and thus not sustainable!?
Jon talks about discoverability of shared bits - quotes Maryann Martone about finding shared ingredients in recipes etc.
Gail: you need fixity: things need to be arranged around a "final" object (in the sense of formally versioned: think the 2nd edition of a book with a different ISBN than the first ed), although it may be remixed etc. and later versions may emerge. Functional requirements: has to be persistent, has to be well defined, has to be stand-uppable/renderable. External data must be persistent and explicitly referenced as well.
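One common way to make fixity checkable is to record a checksum for every file in the "final" object and re-verify later. A minimal sketch, assuming a manifest that maps relative file paths to SHA-256 hex digests; the manifest format and the function names `sha256_of` and `changed_files` are assumptions for illustration, not something discussed in the session.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(manifest, root="."):
    """Return paths from the manifest that are missing or no longer match.

    `manifest` maps a path relative to `root` to its expected hex digest.
    """
    changed = []
    for rel_path, expected in manifest.items():
        target = Path(root) / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel_path)
    return changed
```

An empty result means the versioned object is byte-for-byte what was recorded; anything else means it has drifted and should probably be treated as a new version.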
One partial solution: a Zenodo DOI generated from a self-contained GitHub repo, with data sets that have DOIs. (GC note: I'm not certain that all the tiles in the notebook mosaic must have a DOI as their digital persistent id, but they do all need digital persistent ids of some sort. The benefit of DOIs is that they are well established, trusted, metadata-dependent, and the two DOI agencies in the western scholarly sphere, Crossref and DataCite, are A+ on service, support, innovation, and thoughtfulness.)
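To get a feel for how many tiles already have a DOI, something like the sketch below could be run over the identifiers a notebook references. It only checks whether a string is shaped like a DOI, using a widely circulated pattern that covers most modern DOIs but not every valid one; it does not resolve anything. The names `looks_like_doi` and `tiles_without_doi` are hypothetical, and per the GC note a non-DOI persistent id may be perfectly acceptable, so the output is a list to review, not a list of failures.

```python
import re

# Pattern for most modern DOIs (assumption: some older DOIs will not match).
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/[-._;()/:A-Z0-9]+$", re.IGNORECASE)
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def looks_like_doi(identifier):
    """Return True if the string is shaped like a DOI (syntax check only)."""
    ident = identifier.strip()
    lowered = ident.lower()
    for prefix in DOI_PREFIXES:
        if lowered.startswith(prefix):
            ident = ident[len(prefix):]
            break
    return bool(DOI_PATTERN.match(ident))


def tiles_without_doi(identifiers):
    """Return the identifiers that are not DOI-shaped, for manual review."""
    return [ident for ident in identifiers if not looks_like_doi(ident)]
```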
Also: Open Science Framework registered foo. This could be a/the scholarly binder location. Click-through license. S3/Google Compute/preprint server/thesis stuff. (GC note: institutional or disciplinary repositories/data centers might also make good binder homes as long as they have a sustainable business model to remain perpetually open. For example, Caltech might host our notebook dissertations since they are institutional records and their long-term stewardship is 100% our mission. // Also, Titus, I know a thoughtful open nonprofit publisher who is currently offering low-cost hosting for OA journals and repositories and who might be interested in piloting a hosted binder service.)
(Titus desperately wants to talk to Gail about IPFS but will hold off for now.)