CD2H Resource Discovery Core
- Director: David Eichmann (University of Iowa)
- Co-Director: Kristi Holmes (Northwestern University)
- Community Co-Chair: Nick Anderson (UC Davis)
Other Core Members
Don't edit this - the RPPR generator populates this section
Year 3 Budget
Don't edit this - the RPPR generator populates this section
The mission of the CD2H Resource Discovery Core is to support discovery of the rich landscape of expertise and resources available within the CTSA hubs and beyond, and to provide a robust infrastructure to enable attribution of the many different essential contributions by the translational workforce.
We leverage effective strategies and inventive approaches to build connections within and beyond the CTSA Program. We have adapted and expanded our existing research profiling infrastructure using open collaborative approaches. We are developing tools to identify, track, and disseminate a wide range of research objects (e.g., software, data, informatics, and other non-traditional scholarly products and activities) to enable proper attribution of credit on translational teams and discovery of these objects at the local level and beyond. Finally, extended knowledge about expertise across the CTSAs is being applied to assist in the creation and success of community-wide collaborative functions. This mission is manifest in activities such as:
- Provisioning a CTSA-wide index of all resources utilized across the full spectrum of CD2H activities;
- Creation of expertise, visualizations, and services for adoption and assimilation by CTSA hubs for use in their local environments;
- Development and deployment of an open repository to support preservation, indexing, and discovery of resources at CTSA hubs;
- Development of a data discovery engine to index and search data across hubs (collaboration between Scripps & Northwestern);
- Harmonization of educational resources from multiple existing platforms into a single shared discovery framework;
- Extending representation of expertise and related services across the CTSA consortium;
- Developing a practical, scalable model of contribution and fostering its adoption across the CTSA Program.
CTSA hubs experience serious challenges in maintaining awareness of existing resources, resulting in significant duplication of effort and lost opportunities for synergy. Moreover, empowering hubs to collect, record, preserve, and disseminate a wide range of digital works across the translational workforce is critical to enhance the visibility of those works, promote people and their expertise, support attribution of their work, aid discovery and access by the international scientific community, and support open and FAIR science. A range of key tools and resources spanning the spectrum of expertise, services, and resources holds great promise to address these challenges.
We have developed an open, flexible architecture for resource identification, characterization, and discovery, based on open tools and designed to be extended with new capabilities as needs are identified.
Justification and Feasibility
Although this capability was first identified as a core component of our proposal to create CD2H, we have since received recurring comments regarding the need for it from NCATS personnel, CTSA hub PIs, CTSA informatics directors, and the CTSA Consortium population in general. Feasibility has been demonstrated by our proof-of-concept faceted search interface (see below) and by our use of existing software and service architectures to achieve modernized, safe, and scalable tools. Finally, we have worked extensively with stakeholders to scope requirements and solicit feature requests, while engaging in ongoing data testing and code review to produce dependable resources for the community.
Summary of existing system and findings
See CD2H Labs for our latest demonstration platforms. These include:
- the CD2H project dashboard
- CD2H faceted search
- a prototype of a CTSA Consortium-specific web harvesting and extraction platform (used for service characterizations so far)
- a platform for harvesting and search of CTSA-relevant GitHub repositories, owners and contributors
- a VIVO-compatible platform integrating traditional research profiling data with GitHub information and connections to "grey" sources such as FigShare
The approach employed by the Iowa team to date has been one of rapid prototyping of a harvesting platform for a newly identified information source, integration of the resulting metadata into the SciTS warehouse, and then discovery of those metadata through our faceted search engine. We plan to continue this approach as long as new information sources arise. In Year 3, we will enhance this overall framework with multiple access methods (JDBC, GraphQL, SPARQL) so that hubs can directly interrogate the environment and embed both services and information into their local environments.
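The harvest, integrate, and discover pattern described above can be illustrated with a minimal sketch of faceted filtering and counting over harvested metadata. This is not the SciTS warehouse schema or the CD2H faceted search implementation; the record fields (`source`, `hub`, `type`), the sample records, and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical harvested metadata records; field names and values are
# illustrative assumptions, not the actual warehouse schema.
RECORDS = [
    {"title": "Cohort builder", "source": "github", "hub": "Iowa", "type": "software"},
    {"title": "Biostatistics consult", "source": "web", "hub": "Northwestern", "type": "service"},
    {"title": "Imaging dataset", "source": "figshare", "hub": "Iowa", "type": "dataset"},
    {"title": "ETL scripts", "source": "github", "hub": "Northwestern", "type": "software"},
]


def filter_records(records, selections):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(records, facets):
    """Count how many records fall under each value of each facet."""
    return {
        facet: Counter(record[facet] for record in records if facet in record)
        for facet in facets
    }


# Usage: narrow to GitHub-sourced records, then summarize the remaining facets.
github_hits = filter_records(RECORDS, {"source": "github"})
summary = facet_counts(github_hits, ["hub", "type"])
```

In this sketch each newly harvested source only needs to emit records with the shared facet fields to become discoverable, which mirrors the incremental approach described above.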
The Northwestern team uses Agile development methodology to support progress on nearly all projects. Development is organized into two-week sprints. Each sprint kicks off with a planning meeting where the team decides what work will be completed in the two-week increment and how that work will be achieved. Any dependencies or blockers are also discussed at this time. Teams hold daily in-person or virtual stand-ups to report on work and raise concerns or blockers so that course corrections can be made quickly.
As noted above, we have made significant progress in configuring SciTS as a central resource linking the various activities and tools in this core. This has been greatly informed by our engagement with the CTSA community - particularly in the areas of educational resources and hub services. We have multiple live services operational in CD2H Labs with frequent deliverables scheduled in the coming reporting period.
We have also made significant progress on the InvenioRDM, Personas, and Attribution projects. We released a containerized alpha version of InvenioRDM on our staging server that includes features such as basic deposits, powerful faceted search, and DOI minting. We also released a domain landscape analysis with our Repository & Index Software comparison. We made progress in the computable attribution work (CRediT ontology and Contributor Role Ontology released, draft annotation file, research object assessment and mapping). We held a small Attribution face-to-face (F2F) meeting and are developing a demonstrator. For the Personas project, we completed extensive information assessment and literature gathering and assessment to identify key job families, finalized a template, and created a sample profile. We have also worked to welcome CTSA Program collaborators to the projects through orientation, outreach at meetings and on calls, and other activities to build a community of practice. The Personas project wraps up prior to Year 3, with active dissemination of deliverables.
- Information Architecture (internal project) (RPPR)
- InvenioRDM Research Data Management Platform (RPPR)