backlog

mr-c · Apr 12, 2016 · b814ca8
1 parent 6801e51
commit b814ca8

Showing 4 changed files with 173 additions and 1 deletion.
diff --git a/about.rst b/about.rst
@@ -9,6 +9,6 @@ CWL Community Engineer.
 
 .. raw:: html
 
- <a href="https://impactstory.org/MichaelRCrusoe"><img src="https://impactstory.org/logo/small" width="200" /></a>
+ <a href="https://impactstory.org/u/0000-0002-2961-9670"><img src="https://impactstory.org/logo/small" width="200" /></a>
 
 
diff --git a/community-engineer-update.rst b/community-engineer-update.rst
@@ -0,0 +1,74 @@
+.. post:: 2016-04-08
+   :tags: update
+   :author: me
+   :location: Brussels, Belgium
+
+***************************************
+CWL Community Engineer Six Month Update
+***************************************
+
+Hello everyone. It has been a while since my last update. Here is what has been
+happening in the CWL world since I began full-time work for the project.
+
+Draft 3 was released and the community has committed to releasing 1.0 of the
+standards before ISMB. I will be the release driver.
+
+I met with a potential fiscal sponsor, the Software Freedom Conservancy, and
+submitted an application for the CWL project to become part of their 501(c)(3).
+
+I've been taking full advantage of my base in Europe (first Romania and now
+Belgium) to raise and enhance the perception of CWL in the European life science
+computing community. I presented at two leading centers (SciLifeLab in
+Stockholm, Flanders ExaScience Lab in Belgium) and I participated in four
+ELIXIR sponsored hackathons: Amsterdam, Netherlands; Freiburg, Germany;
+Copenhagen, Denmark; and Trondheim, Norway.
+
+A CWL subgroup of academic cluster users is figuring out what changes are
+needed to support non-containerized tool execution. Non-cloud support
+for CWL will be critical for wider adoption. I have enjoyed coordinating this
+group and I was able to host a visiting Australian graduate student (Kevin
+Murray) who has gotten Docker containers to work on older platforms without
+needing to upgrade them or use ``root`` privileges.
+
+The CWL is spreading into the wider F/OSS tech world thanks to a partnership
+with the Debian-Med community, the leading community packagers of bioinformatic
+tools and workflows. In support of this partnership I applied for and received
+official status within that community (“Debian Maintainer”) and I have an
+application for full status (“Debian Developer”) in progress.
+
+Roman Valls Guimera and I have started a subproject to automatically produce
+CWL descriptions for those Python tools that use Python’s standard argument
+parser. This is now a Google Summer of Code project that will hopefully
+get GSoC support for a student to work on it over the summer.
+
+Speaking of GSoC, I agreed to co-mentor (with Stian Soiland-Reyes) another
+student's project to add CWL support to the Apache Taverna project.
+
+New Implementations & SDKs:
+Paul Gross's Java re-implementation has already found a couple of issues with the
+specifications and fixes have been incorporated.
+
+Sketched out a plan for using Peter Amstutz’s “schema salad” tool to
+auto-generate code for representing the CWL object model in as many different
+languages as we care for. This is a critical first step to having autogenerated
+SDKs in multiple languages.
+
+Reference implementation improvements:
+Finished review of Peter Amstutz’s ‘cwltool’ and ‘schema salad’. I am maturing
+his work by adding Python 3 compatibility, type checking, code cleanups, and
+documentation.
+
+Other CWL impacts:
+Sent a letter of support for Dr. Bernhard Renard, Robert Koch Institute
+(Germany), and his “Collaborative Benchmarking of Bioinformatics Tools and
+Workflows (CoBe)” project which uses CWL as a core technology.
+
+GA4GH container registry API project: CWL is a key component and is seen as a leader
+on the metadata issue; many CWL community members participate in their weekly call.
+
+Logo acquired, Twitter account created, domain name purchased.
+
+Continuous testing of CWL implementations. Peter Amstutz and I have set up
+https://ci.commonwl.org to test the conformance of CWL implementations on a
+continuous basis.
+
diff --git a/cwl-paris-hackathon.rst b/cwl-paris-hackathon.rst
@@ -0,0 +1,16 @@
+.. post:: 2016-03-21
+   :tags: events
+   :author: me
+   :location: Paris, France
+
+******************************************************
+Technical Hackathon : Tools, Workflows and Workbenches
+******************************************************
+
+A hackathon will take place on 18-20 May 2016 at the Institut Pasteur in Paris.
+It brings together developers from the ELIXIR Tools & Data Services Registry,
+Galaxy, Taverna, Arvados, CWL, ReGaTE and EDAM ontology, with Galaxy instance
+providers from ELIXIR and beyond, to promote collaboration and technical
+developments. `Further details to follow
+<https://www.elixir-europe.org/events/technical-hackathon-tools-workflows-and-workbenches>`__.
+
diff --git a/tacc-201511.rst b/tacc-201511.rst
@@ -0,0 +1,82 @@
+.. post:: 
+   :tags: weekly-update
+   :author: me
+   :location: Austin
+
+********************************************
+Summary of TACC Life Science Computing visit
+********************************************
+
+Met with John Fonner, Joe Stubbs, Matt Vaughn, Rion Dooley, and Victor Eijkhout.
+
+They are very positive about CWL and have agreed to become a CWL partner; effort
+will come from TACC sources, and possibly also iPlant. They are also working on
+an app directory.
+
+Existing capabilities:
+
+From the CWL perspective, `The Agave Platform <http://agaveapi.co/>`__ is a
+multi-tenant, multi-execution-environment remote job runner. The primary use
+case is submission of jobs via a command line tool; specification of tool
+options is done via a JSON-formatted plain text file.
+
+Their workflow manager is called ``endofday``. It started as a Nextflow-based
+Docker orchestration program; it is now a pydoit-based Docker & Agave
+application orchestration program. It is for long-running analysis steps, not
+(web) services.
+
+Areas of concern:
+
+How and where to make the link between a generic CWL tool description & a
+particular tenant? This will be a concern for other platforms that don't use
+Docker, such as Galaxy.
+
+Their asks:
+
+1. Site-specific config.
+2. Python & Java SDKs/libraries autogenerated from the spec for parsing CWL
+   files.
+3. Document how to run the test suite by hand.
+4. Best practices document: imports at top; IDs defined explicitly for each
+   tool.
+5. Reduce syntax verbosity via implicit namespacing. [Does Draft 3 satisfy
+   that?]
+
+Follow up: John Fonner & others to present Agave & their workflow system to the
+CWL group during the December 1st video chat. They will meet privately after
+that to organize; MRC to follow up on Dec
+
+A lot of the discussion was about the collaboration model between the larger
+CWL community and specific implementations: how will tool and workflow
+descriptions be shared?
+
+For implementations not using Docker: one collaboration model is to fork each
+tool description as that tool is installed: adding implementation specific
+fields to indicate which tenant the tool is installed to and other required
+details. In the case of a tool being installed multiple times the tool ID would
+be changed to allow for unique references from workflows. In this model
+workflows from outside sources would also be customized to refer to these
+platform-specific tools.
+
+Concerns about the portability of such workflows outside the implementations
+that produced them were raised. 
+
+Another proposal was to add another stanza to the job document (alongside the
+identifier, already approved for Draft 3, of which workflow or tool to run).
+
+However this could get quite unwieldy for users, especially for complex
+workflows with many steps & applications.
+
+While this information could be added on a per-tool basis to the CLI
+description document, it would require changing the tool IDs from the
+community-maintained copies, thus breaking portability of workflows that
+reference such tools.
+
+Misc questions:
+
+How to mark input as required / optional? (Is this the ``type: [null, ...]``
+trick?)
+Would like to be able to feed output document back in as new input document to
+reproduce/re-do analysis automatically. Great idea, easily doable by adding the
+input document to the output object and updating the spec to specify that the
+output stanza (if any) should be ignored on input objects.