Proposal: Full Musnad Ahmad Integration (Darussalam Edition) via ETL Pipeline #5

@mahmoudkalimero1100-rgb

Hello Sunnah.com team,

Thank you for your amazing work on this platform. I noticed that the Musnad Ahmad collection is currently only around 4% complete.

As a software engineer, I would like to help accelerate this process. I have extracted the complete Musnad Ahmad dataset (Darussalam numbering, more than 27,000 hadiths) and written an ETL script that converts the raw text into structured JSON closely matching your data schema, with fields such as collection, bookNumber, chapterTitleArabic, and hadithNumber.
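
As a rough illustration of that transformation (a sketch only, not the actual script from the repository), one raw record could be mapped to the structured shape as follows. The output keys collection, bookNumber, chapterTitleArabic, and hadithNumber come from the description above; the raw input keys, the textArabic field, the "ahmad" collection value, and the choice of integer types are all assumptions made for the example, and the sample text is placeholder data.

```python
import json


def format_hadith(raw: dict, collection: str = "ahmad") -> dict:
    """Map one raw record to a structured record.

    The raw keys ("book", "chapter_ar", "number", "text_ar") are
    hypothetical; adapt them to whatever the extraction step produces.
    """
    return {
        "collection": collection,  # assumed collection identifier
        "bookNumber": int(raw["book"]),
        "chapterTitleArabic": raw["chapter_ar"].strip(),
        "hadithNumber": int(raw["number"]),
        # Hypothetical field: collapse runs of whitespace in the body text.
        "textArabic": " ".join(raw["text_ar"].split()),
    }


# Placeholder input, not real hadith data.
sample = {"book": "1", "chapter_ar": " عنوان ", "number": "1", "text_ar": "نص   تجريبي"}
record = format_hadith(sample)

# ensure_ascii=False keeps the Arabic text readable in the output file.
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Whether the target schema expects numbers or strings for bookNumber and hadithNumber is exactly the kind of detail the first question below is meant to settle.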

I have split the dataset into smaller JSON files, organized by book and chapter, to keep it manageable. You can review my proof of concept, the raw data, and the formatted output in my repository:
https://github.com/mahmoudkalimero1100-rgb/musnad-ahmad-json
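
For readers who want a picture of the splitting step without opening the repository, here is a minimal sketch of grouping formatted records by book and writing one JSON file per book. It assumes each record carries an integer bookNumber; the function name split_by_book and the book_001.json naming pattern are hypothetical and may not match the repository layout.

```python
import json
import tempfile
from collections import defaultdict
from pathlib import Path


def split_by_book(records, out_dir):
    """Group records by bookNumber and write one JSON file per book."""
    by_book = defaultdict(list)
    for rec in records:
        by_book[rec["bookNumber"]].append(rec)

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths = []
    for book, items in sorted(by_book.items()):
        # Hypothetical naming scheme: zero-padded book number.
        path = out / f"book_{book:03d}.json"
        path.write_text(
            json.dumps(items, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )
        paths.append(path)
    return paths


# Placeholder records written to a temporary directory.
demo_records = [
    {"bookNumber": 1, "hadithNumber": 1},
    {"bookNumber": 2, "hadithNumber": 2},
    {"bookNumber": 1, "hadithNumber": 3},
]
demo_paths = split_by_book(demo_records, tempfile.mkdtemp())
print([p.name for p in demo_paths])
```

One file per book keeps each diff small, which would also suit book-by-book pull requests.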

I would love to contribute this data. Before submitting any Pull Requests, could you let me know:

Does the JSON structure in my /data/formatted/ folder align with your ingestion requirements?

Would you prefer me to submit incremental PRs book by book to facilitate your code review?

Looking forward to your feedback and to helping complete this collection!

Best regards,
