This repo contains source data, parsing scripts, data files and logs for data projects going into Sefaria. This is the messy input that is processed to become the Sefaria Library.
If you're looking to download Sefaria's texts or links, please see Sefaria-Export. Exported data has a uniform structure.
For Sefaria's source code, see Sefaria-Project.
book structures - scripts to create schemas for new books
sources/ - original digital files that were manipulated to produce our data, along with the scripts used to parse them
Match Logs - logs from commentary/text matching scripts
misc/ - miscellaneous small data files about texts