[MAINT] Saket Choudhary @ pysradb #101

@saketkc

🧑 Your Details

Username: saketkc

Full Name: Saket Choudhary

Photo URL: https://avatars.githubusercontent.com/u/682153?v=4

Designation / Role: Assistant Professor, Koita Centre for Digital Health, IIT Bombay

Social Profiles:
Github: https://github.com/saketkc
Linkedin: http://linkedin.com/in/saket-choudhary/
Substack: https://substack.com/@genomeofindia
Blue Sky: https://bsky.app/profile/saketkc.bsky.social
X/Twitter: http://x.com/saketkc


🧪 Projects

Project 1

Name: pysradb

Project Link: http://github.com/saketkc/pysradb

Website Link: http://saket-choudhary.me/pysradb

Logo URL: https://saket-choudhary.me/pysradb/_static/pysradb_v3.png

Short Description: pysradb is a Python package to fetch metadata associated with genome sequencing data deposited in the Sequence Read Archive (SRA) database at NCBI or the European Nucleotide Archive (ENA).

Full Description: The NCBI Sequence Read Archive (SRA) is the primary archive of next-generation sequencing datasets. SRA makes metadata and raw sequencing data available to the research community to encourage reproducibility and to provide avenues for testing novel hypotheses on publicly available data. However, methods to programmatically access this data are limited. pysradb provides a collection of command-line tools and a Python API to query and download metadata and data from SRA.



📝 Fun Maintainer Questions

1. How to support
The pysradb repository has a set of open issues that we need help with. People unfamiliar with the general world of bioinformatics or genomics can also contribute. The crux of pysradb's operations happens through interaction with NCBI's E-utilities (eutils) API, so as long as you are comfortable working with APIs you can contribute to pysradb! Any contributions are welcome - PRs, issues, documentation fixes!
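For contributors who have not used E-utilities before, here is a minimal sketch of what a request looks like. It only builds an ESearch URL against the public E-utilities endpoint and makes no network call. The helper name `build_esearch_url` and the default `retmax` are hypothetical and chosen for illustration; this is not how pysradb itself is implemented.

```python
from urllib.parse import urlencode

# Public base URL of NCBI's Entrez E-utilities.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term: str, retmax: int = 20) -> str:
    """Build an ESearch URL that looks up `term` in the SRA database.

    Hypothetical helper for illustration only; not part of pysradb.
    """
    params = {
        "db": "sra",          # search the Sequence Read Archive
        "term": term,         # accession or free-text query
        "retmode": "json",    # ask for a JSON response
        "retmax": retmax,     # maximum number of IDs to return
    }
    # urlencode escapes the values and joins them as key=value pairs.
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


if __name__ == "__main__":
    print(build_esearch_url("SRP016501"))
```

Fetching the printed URL with any HTTP client should return a JSON document listing matching record IDs, which is roughly the kind of response a contributor would end up parsing.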

2. A small brief about your project
Biological researchers worldwide generate petabytes of genomic sequencing data, but accessing it is a nightmare. Scientists spend weeks navigating the maze of NCBI's SRA, ENA, and GEO databases just to find and download the datasets they need. The identifiers are cryptic (SRP? GSE? SRR?), the APIs are complex, and downloading terabytes of data often fails midway or is incomplete without the associated metadata! Pysradb democratizes access to the world's largest repository of sequencing information.
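To make the identifier soup a little more concrete, here is a small sketch that maps an accession prefix to the kind of record it names. The function `describe_accession` and the lookup table are hypothetical illustrations, not part of pysradb, and the table deliberately covers only a few common NCBI SRA and GEO prefixes.

```python
import re

# A few common accession prefixes and the record type each one names.
# Deliberately incomplete: ENA and DDBJ prefixes are left out.
ACCESSION_KINDS = {
    "SRP": "SRA study (project)",
    "SRX": "SRA experiment",
    "SRS": "SRA sample",
    "SRR": "SRA run",
    "GSE": "GEO series",
    "GSM": "GEO sample",
}


def describe_accession(accession: str) -> str:
    """Return a human-readable record type for an accession, or 'unknown'."""
    # Accept three letters followed by digits, ignoring case and whitespace.
    match = re.fullmatch(r"([A-Z]{3})(\d+)", accession.strip().upper())
    if match is None:
        return "unknown"
    return ACCESSION_KINDS.get(match.group(1), "unknown")


if __name__ == "__main__":
    for acc in ["SRP016501", "GSE12345", "SRR000001", "not-an-id"]:
        print(acc, "->", describe_accession(acc))
```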

3. One FOSS maintainer lesson for your younger self
When I started, I thought I needed to be the best at coding to be a good maintainer. A good maintainer is not necessarily the best coder but an all-rounder: someone who listens, provides feedback, documents, and keeps the community together rather than forcing their ideology on everyone.

4. Why do you do it? Why do you bother maintaining a FOSS project?

I maintain a FOSS project to lower the barriers that researchers like me face.

5. If your repo had a theme song, what would it be?

It has to be "Let me speak" by Indian Ocean: https://youtu.be/4NbXG9i8uFg. It is the one song I listened to on loop when I coded the first version of pysradb.

6. Which file in your project would you most like to set on fire?

It has to be sradb.py. The name of the repo was inspired by a now-defunct package, and this file is the legacy I believe pysradb never needed.

7. What's your open-source villain origin story?

I was exposed to open source during my internship at SlideShare. The SlideShare team were open source aficionados. That is where I learned and used Ruby on Rails and web development, which led me to GSoC 2012, where I contributed a slide importer for Connexions. This was my entry point for GSoC 2013 and GSoC 2014, and I just got hooked on all the cool stuff out there in the open.

8. If you had to use one emoji to convey what it's like to be a FOSS maintainer, what would it be?
🧠 - Be curious, listen, empathise and work hard!
