title: newspapers_al-quds: read me
author: Till Grallert
date: 2018-03-28 10:57:56 +0300


This repository contains bibliographic metadata for the newspaper al-Quds published by Jirjī Ḥabīb Ḥanāniyā in Jerusalem between 1908 and 1914. The Center for Palestine Studies at Columbia University scanned issues 1 to 391 and put them online. Currently these issues can only be accessed through their issue number and nested sub-pages. I therefore produced machine-actionable bibliographic metadata including volume and issue numbers, as well as dates in all three calendars mentioned in the paper's masthead.

NOTE: As of late 2021 the facsimiles can no longer be reached. They were originally hosted on a Google Drive and all links are now broken.

some technical details

This repository contains a single TEI XML file with one <biblStruct> for each issue. This file is produced through automatic iteration making use of this code, followed by manual validation against the digital facsimiles.
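To illustrate what such an automatic iteration might look like, here is a minimal Python sketch. It is not the code linked above: the function names `iter_issues` and `bibl_struct`, the weekday-based schedule, the start date, and the exact layout inside <biblStruct> are all assumptions made for illustration, and only a single (Gregorian) date is generated rather than all three calendars.

```python
from datetime import date, timedelta
from itertools import islice
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # official TEI namespace


def iter_issues(first_issue, first_date, weekdays):
    """Yield (issue_number, date) pairs for an assumed weekday schedule.

    weekdays uses Python's convention: Monday is 0, Sunday is 6.
    """
    issue, day = first_issue, first_date
    while True:
        if day.weekday() in weekdays:
            yield issue, day
            issue += 1
        day += timedelta(days=1)


def bibl_struct(title, issue, day):
    """Build one TEI <biblStruct> for an issue (hypothetical layout)."""
    bibl = ET.Element(f"{{{TEI_NS}}}biblStruct")
    monogr = ET.SubElement(bibl, f"{{{TEI_NS}}}monogr")
    ET.SubElement(monogr, f"{{{TEI_NS}}}title", level="j").text = title
    imprint = ET.SubElement(monogr, f"{{{TEI_NS}}}imprint")
    ET.SubElement(imprint, f"{{{TEI_NS}}}date", when=day.isoformat())
    ET.SubElement(monogr, f"{{{TEI_NS}}}biblScope", unit="issue", n=str(issue))
    return bibl


# Hypothetical input parameters: issue 1 on a Tuesday, published Tuesdays and Fridays.
for number, day in islice(iter_issues(1, date(1910, 1, 4), {1, 4}), 4):
    print(ET.tostring(bibl_struct("al-Quds", number, day), encoding="unicode"))
```

Because the real publication schedule was irregular, a generator like this would need its start date and weekdays reset whenever the facsimiles show a deviation, which is where the manual validation comes in.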

The TEI is then automatically converted to MODS XML for integration into reference management software such as Zotero.
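The following Python sketch shows roughly what a TEI-to-MODS mapping for a single issue could involve. It is an illustration only and not the conversion used in this repository: the function name `tei_to_mods`, the sample record (including its date), and the choice of MODS elements are assumptions, and a full conversion would carry over considerably more detail.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
MODS = "{http://www.loc.gov/mods/v3}"

# Hypothetical sample record; the date and structure are made up for illustration.
SAMPLE_TEI = """<biblStruct xmlns="http://www.tei-c.org/ns/1.0">
  <monogr>
    <title level="j">al-Quds</title>
    <imprint><date when="1910-01-04"/></imprint>
    <biblScope unit="issue" n="1"/>
  </monogr>
</biblStruct>"""


def tei_to_mods(bibl):
    """Map one TEI <biblStruct> element to a simplified MODS <mods> record."""
    monogr = bibl.find(f"{TEI}monogr")
    mods = ET.Element(f"{MODS}mods")

    # Journal title
    title_info = ET.SubElement(mods, f"{MODS}titleInfo")
    ET.SubElement(title_info, f"{MODS}title").text = monogr.findtext(f"{TEI}title")

    # Publication date, taken from the ISO value in @when
    origin = ET.SubElement(mods, f"{MODS}originInfo")
    date_el = monogr.find(f"{TEI}imprint/{TEI}date")
    issued = ET.SubElement(origin, f"{MODS}dateIssued", encoding="w3cdtf")
    issued.text = date_el.get("when")

    # Volume and issue numbers become <detail> entries inside <part>
    part = ET.SubElement(mods, f"{MODS}part")
    for scope in monogr.findall(f"{TEI}biblScope"):
        detail = ET.SubElement(part, f"{MODS}detail", type=scope.get("unit"))
        ET.SubElement(detail, f"{MODS}number").text = scope.get("n")
    return mods


mods = tei_to_mods(ET.fromstring(SAMPLE_TEI))
print(ET.tostring(mods, encoding="unicode"))
```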

notes on the digital facsimiles

Since the publication schedule of al-Quds was rather irregular, I had to check a large number of facsimiles for their publication dates in order to adjust the input parameters for the algorithm generating the metadata. In doing so I came across a large number of missing issues, sub-pages that display only "Hello world", and incomplete scans. I have listed these errors below. Note that the list of files with missing pages will inevitably grow, since I have not gone through every individual issue (and might never do so).

  • errors:
    • Missing scans (some of these pages show "Hello world"):
      • The purported scan of #142 is in fact a duplicate of #141
      • No file displayed for #168
      • #170
      • #254
      • #373
      • #377
      • #265 has not been scanned
      • #345 has not been scanned
      • #360 has not been scanned
      • #372 has not been scanned
    • Cut-off scans with illegible columns:
    • Missing pages:
      • page 4 is missing from #154
      • page 3 is missing from #224
      • page 3 is missing from #336
    • URLs with different patterns:


Bibliographic metadata as TEI and MODS xml for Jirjī Ḥabīb Ḥanāniyā's newspaper al-Quds (القدس) from Jerusalem, 1908-1914






