Source files, scripts and data imported to Sefaria.


This repo contains source data, parsing scripts, data files and logs for data projects going into Sefaria. This is the messy input that is processed to become the Sefaria Library.

If you're looking to download Sefaria's texts or links, please see Sefaria-Export. Exported data has a uniform structure.

For Sefaria source code see Sefaria-Project.


  • book structures - scripts to create schemas for new books
  • sources/ - original digital files that were manipulated to produce our data, along with the scripts used to parse them
  • Match Logs - logs from commentary/text matching scripts
  • misc/ - miscellaneous small data files about texts