scikit-project: A Guide to Building Your Open-Source Science Project
A cheatsheet to develop a scientific open-source library from scratch.
Zero to Library
Here you will find information to design, build, and release an open-source Python library for scientific research, from start to finish.
0 - Open Source for Open Science: An overview of the Python and open-source ecosystem and how it relates to open science.
1 - Before Starting Coding: Setting up the working environment on your machine, including the tools you will need to write code efficiently.
- Sublime, Git, GitHub
2 - Developing your Project: A step-by-step guide with best practices for coding, and tips for making code development as effortless as possible.
- PEP 8, PEP 257
- Jupyter Notebook
3 - Testing: Especially in software for scientific research, the destination is not always clear at the start. Code is written, optimized, and reorganized. Unit testing is crucial to avoid getting lost in the process.
- nose2, pytest
- Travis CI
4 - Packaging: Package your code so that your library can be installed and imported like any other.
5 - Documentation: "Code is more often read than written" said Guido van Rossum, the creator of Python. This is especially true for collaborative projects in scientific research, in which researchers might interact with code from different locations, at different times, and on different time scales. Documenting features well, from functions to classes, lets you generate full-fledged online documentation with little effort, which can considerably help adoption.
- Read the Docs
6 - Distributing: It's time to release your library. Making your library easy to install and import matters to its users, who are also potential future contributors. Several tools now let you deploy code in a variety of environments through package managers.
7 - Publishing: As a researcher, you are asked to publish your research results. For a code project, these can be novel research results derived from the software, or the software itself if it provides an innovative method or approach that can help the research community. Some examples of best practices for publishing code and datasets are given, such as using Zenodo for file hosting and an immediate DOI. A list of research journals especially relevant for physicists is provided.
8 - Hosting: Interactive notebooks allow users to play with your code effortlessly, without installing any software. They are also a great solution for interactive workshops and the classroom, where students can start tinkering with code right away.
- My Binder
9 - Useful Links: A collection of useful links is provided.
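As a rough illustration of what steps 2, 3, and 5 look like together, here is a minimal sketch of a function written in PEP 8 style with a PEP 257 docstring, followed by a test that pytest would collect. The function `mean` and its test are hypothetical examples made up for this guide, not part of any existing library, and the sectioned docstring layout is just one common convention.

```python
"""Hypothetical example module; all names here are assumed for illustration."""


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers.

    Parameters
    ----------
    values : sequence of float
        The numbers to average.

    Returns
    -------
    float
        The arithmetic mean of ``values``.

    Raises
    ------
    ValueError
        If ``values`` is empty.
    """
    if len(values) == 0:
        raise ValueError("values must not be empty")
    return sum(values) / len(values)


def test_mean():
    """pytest collects functions named test_* and runs their assertions."""
    assert mean([1.0, 2.0, 3.0]) == 2.0
    assert mean([5]) == 5.0
```

The same docstring serves two purposes: it is shown by `help(mean)` during development, and documentation generators can read it to build the online reference.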
This project is meant to help you in two ways:
The files linked above in the table of contents provide a guide that you can navigate to learn more.
The folders in this repository contain everything needed to build an example library, mylibrary, and deploy it. You can then modify the contents to fit your project. The main folders are:
- mylibrary/mylibrary: where the code is located;
- mylibrary/mylibrary-notebooks: where the .ipynb notebooks are kept.
Other folders will be generated automatically by the setup.py file and by building the documentation.
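For orientation, a setup.py for such a layout could look roughly like the sketch below. It is not the file shipped in this repository: the version number, description, and Python requirement are placeholders, and the guard on `sys.argv` is a choice made here so that the file can be read or imported without triggering a build.

```python
"""Sketch of a minimal setup.py; every metadata value below is a placeholder."""
import sys

# Placeholder metadata: replace with your project's actual values.
METADATA = {
    "name": "mylibrary",
    "version": "0.1.0",  # assumed version number
    "description": "An example scientific library",  # assumed text
    "python_requires": ">=3.6",  # assumed minimum Python version
    "install_requires": [],  # runtime dependencies go here, e.g. "numpy"
}


def main():
    """Hand the metadata to setuptools, which discovers the package folders."""
    # Imported here so the metadata can be inspected without setuptools.
    from setuptools import find_packages, setup

    setup(packages=find_packages(), **METADATA)


# Only call setuptools when a command (such as `sdist`) is supplied.
if __name__ == "__main__" and len(sys.argv) > 1:
    main()
```

Running `python setup.py sdist` from the folder containing this file would then create a source distribution in a new `dist` folder, one of the automatically generated folders mentioned above.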
This project stems from a series of talks, lectures, and other ideas developed in relation to the QuTiP (Quantum Toolbox in Python) library and open source in quantum-tech research. The aim is to provide useful information in widening circles: first to the quantum-tech research community, then to the physics, academic, and broader scientific communities engaging with open-source software.
"Open-source scientific computing for quantum technology: QuTiP", Nathan Shammah, RIKEN Berkeley workshop, Berkeley, USA, 2019
"The rise of open source in quantum physics research", Nathan Shammah and Shahnawaz Ahmed, Nature's physics blog, January 9, 2019