Analyse disciplinary differences of software mentions across different large scale software mention datasets

f-krueger/SoftwareImpactHackathon2023_DisciplinaryDifferences

Project name: Exploring Disciplinary Differences in Software Mentions

Project description

Project slides

This project was part of the Chan Zuckerberg Initiative (CZI) hackathon "Mapping the Impact of Research Software in Science". We study the following questions:

  • What is the distribution of publications mentioning (or not) software across disciplines?
  • How is different software used by researchers across their publications?
  • What is the ‘proximity’ of scientific publications to the use of software? (ongoing)

Methodology

We conduct a scientometric analysis of publications that mention software, matching software mentions to papers, authors, and disciplines.

Datasets

Software/Tools

  • Google BigQuery (InSySPo project - Brazil)
  • Databricks
  • VOSviewer
  • R
  • Python

Data collection

We matched software mentions from the CZI dataset and from SoftwareKG to OpenAlex publications, using DOI and PMCID as identifiers.
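The matching step can be sketched in plain Python. This is only an illustration under stated assumptions: the field names (`doi`, `pmcid`, `id`, `discipline`), the helper names, and the DOI-first lookup order are hypothetical and are not taken from the project's code, which ran on BigQuery and Databricks.

```python
def normalize_doi(doi):
    """Lower-case a DOI and strip common URL or 'doi:' prefixes."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi or None


def match_mentions(mentions, publications):
    """Attach a publication to each mention, trying DOI first, then PMCID."""
    by_doi, by_pmcid = {}, {}
    for pub in publications:
        doi = normalize_doi(pub.get("doi"))
        if doi:
            by_doi[doi] = pub
        if pub.get("pmcid"):
            by_pmcid[pub["pmcid"].upper()] = pub

    matched, unmatched = [], []
    for m in mentions:
        doi = normalize_doi(m.get("doi"))
        pmcid = (m.get("pmcid") or "").upper()
        pub = by_doi.get(doi) or by_pmcid.get(pmcid)
        if pub is None:
            unmatched.append(m)
        else:
            matched.append({**m, "pub_id": pub["id"], "discipline": pub["discipline"]})
    return matched, unmatched


# Toy records with hypothetical field names and values.
publications = [
    {"id": "W1", "doi": "https://doi.org/10.1000/ABC", "pmcid": "PMC111", "discipline": "Biology"},
    {"id": "W2", "doi": None, "pmcid": "PMC222", "discipline": "Physics"},
]
mentions = [
    {"software": "ImageJ", "doi": "10.1000/abc", "pmcid": None},
    {"software": "R", "doi": None, "pmcid": "pmc222"},
    {"software": "SPSS", "doi": "10.9999/zzz", "pmcid": None},
]
matched, unmatched = match_mentions(mentions, publications)
```

Keeping the unmatched mentions in a separate list makes it easy to report how much of each dataset could not be linked to a publication.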

Software name disambiguation in CZI dataset

Some software names in the CZI dataset were not disambiguated, so the same software could appear under several spellings. We used fuzzy matching to identify similar names and merged them before plotting our networks.
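One possible way to merge similar names is sketched below with Python's standard `difflib` module. The source does not say which fuzzy-matching method or threshold was used, so the greedy grouping strategy, the `0.85` threshold, and the function names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two names (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def merge_similar_names(names, threshold=0.85):
    """Greedily map each name to the first earlier name that is similar enough.

    The threshold is an assumed value, not one reported by the project.
    """
    canonical = []
    mapping = {}
    for name in names:
        for canon in canonical:
            if similarity(name, canon) >= threshold:
                mapping[name] = canon
                break
        else:
            # No sufficiently similar name seen so far: start a new group.
            canonical.append(name)
            mapping[name] = name
    return mapping


names = ["ImageJ", "Image J", "imagej", "SPSS", "Stata"]
mapping = merge_similar_names(names)
```

A threshold that is too low can merge distinct tools with short or similar names, so the merged groups are worth inspecting by hand before they are used in a network.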

Findings

Top software per discipline

[Figure: top software per discipline]
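A ranking like this could be computed roughly as follows. The sketch assumes the matched data is available as `(discipline, software)` pairs, one per mention. Whether the project counted mentions or distinct publications is not stated, so counting mentions here is an assumption, and the records are toy values.

```python
from collections import Counter, defaultdict


def top_software_per_discipline(records, k=3):
    """Return the k most frequently mentioned software names per discipline."""
    counts = defaultdict(Counter)
    for discipline, software in records:
        counts[discipline][software] += 1
    return {d: c.most_common(k) for d, c in counts.items()}


# Toy (discipline, software) pairs, invented for illustration.
records = [
    ("Biology", "ImageJ"),
    ("Biology", "ImageJ"),
    ("Biology", "R"),
    ("Psychology", "SPSS"),
    ("Psychology", "SPSS"),
    ("Psychology", "R"),
]
top = top_software_per_discipline(records, k=1)
```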

Software mentions per discipline across time

[Figure: software mentions across disciplines over time]

Software mention networks

Using the CZI dataset (1.7 million publications)

[Figure: software mention network in the CZI dataset]

Using the SoftwareKG dataset

[Figure: software mention network in the SoftwareKG dataset]

Software network differences across contrasting disciplines

[Figure: comparison of software mention networks]
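The source does not describe how the networks in this section were constructed. A common choice, assumed here, is a co-mention network in which two software names are linked whenever they are mentioned in the same publication, with the edge weight counting the publications they share. The sketch below builds such an edge list from a hypothetical mapping of publication IDs to mentioned software.

```python
from collections import Counter
from itertools import combinations


def co_mention_edges(pub_to_software):
    """Count, for each pair of software names, the publications mentioning both."""
    edges = Counter()
    for software in pub_to_software.values():
        # Deduplicate within a publication and sort so each pair has one key.
        for a, b in combinations(sorted(set(software)), 2):
            edges[(a, b)] += 1
    return edges


# Toy input: publication ID -> software mentioned (hypothetical values).
pubs = {
    "W1": ["R", "ImageJ"],
    "W2": ["ImageJ", "R", "Python"],
    "W3": ["SPSS"],
}
edges = co_mention_edges(pubs)
```

The resulting weighted edge list can be exported to a file for a network visualisation tool. Restricting `pubs` to one discipline at a time gives per-discipline networks that can then be compared.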

Future work

Software dependency per domain

[Figure: software dependency per domain]

Software dependency domain comparison

[Figure: software dependency domain comparison]

Contributors

  • Alexy Khrabrov
  • Frank Krüger
  • Fuqi Xu
  • Huimin Xu
  • Puyu Yang
  • Rodrigo Costas
  • Shahan Ali Memon

About this project

This repository was developed as part of the Mapping the Impact of Research Software in Science hackathon hosted by the Chan Zuckerberg Initiative (CZI). By participating in this hackathon, owners of this repository acknowledge the following:

  1. The code for this project is hosted by the project contributors in a repository created from a template generated by CZI. The purpose of this template is to help ensure that repositories adhere to the hackathon’s project naming conventions and licensing recommendations. CZI does not claim any ownership or intellectual property on the outputs of the hackathon. This repository allows the contributing teams to maintain ownership of code after the project, and indicates that the code produced is not a CZI product, and CZI does not assume responsibility for assuring the legality, usability, safety, or security of the code produced.
  2. This project is published under an MIT license.

Code of Conduct

Contributions to this project are subject to CZI’s Contributor Covenant code of conduct. By participating, contributors are expected to uphold this code of conduct.

Reporting Security Issues

If you believe you have found a security issue, please disclose it responsibly by contacting the repository owner via the ‘security’ tab above.
