Data Collection
This page explains how to collect sighting data, run the pipeline, and add new records.
Each record in the dataset represents one confirmed human sighting with:
| Field | Description |
|---|---|
| Date | The calendar date of the sighting (local date) |
| Location | Latitude, longitude, and elevation in metres |
| Observed time | The local time at which the sighting occurred |
| UTC offset | The hours offset from UTC at that date and location |
The pipeline converts each record into a solar depression angle by back-calculating the sun's position at the UTC moment of the sighting using PyEphem with atmospheric refraction.
Not included: calculated prayer times, angle guesses, or aggregate statistics. Only records where an actual human reported "I saw true dawn at this time on this date at this location."
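The back-calculation described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's actual code: the function names `to_utc` and `depression_angle_deg` are hypothetical, and the PyEphem import sits inside the second function so the time-conversion helper works without the library installed.

```python
from datetime import datetime, timedelta
import math


def to_utc(date_local: str, time_local: str, utc_offset: float) -> datetime:
    """Convert a local date/time plus a UTC offset in hours to a naive UTC datetime."""
    local = datetime.strptime(f"{date_local} {time_local}", "%Y-%m-%d %H:%M")
    return local - timedelta(hours=utc_offset)


def depression_angle_deg(utc_dt: datetime, lat: float, lng: float,
                         elevation_m: float) -> float:
    """Solar depression angle in degrees (positive means the sun is below the horizon)."""
    import ephem  # third-party: pip install ephem

    obs = ephem.Observer()
    obs.lat = str(lat)    # PyEphem treats strings as degrees, floats as radians
    obs.lon = str(lng)
    obs.elevation = elevation_m
    obs.date = utc_dt     # PyEphem dates are always UTC
    # obs.pressure keeps its non-zero default, so atmospheric refraction is applied
    sun = ephem.Sun(obs)
    return -math.degrees(sun.alt)
```

With PyEphem installed, a record like the example entry further down this page would be evaluated as `depression_angle_deg(to_utc("2024-06-21", "04:38", 1.0), 51.150, -3.650, 430.0)`.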
# Python 3.10+
python -m venv .venv
source .venv/bin/activate # on Windows: .venv\Scripts\activate
pip install -r requirements.txt

python -m src.pipeline

This runs the following steps in sequence:
- Fetches the OpenFajr iCal feed from calendar.google.com: ~4,018 community-verified Fajr records from Birmingham, UK, 2016-2026. Requires network access.
- Loads manually compiled records from src/collect/verified_sightings.py and per-source CSVs in data/raw/raw_sightings/.
- Loads pre-computed SQM angles from src/collect/precomputed_angles.py (1,621 Basthoni 2022 records where depression angles were measured directly by instrument).
- Looks up missing elevations via the Open-Topo-Data API (with Open-Elevation fallback) for any record where elevation_m == 0.
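The elevation step with its fallback could look something like the sketch below. It is an assumption-laden illustration rather than the pipeline's implementation: the helper names are hypothetical, the `srtm90m` dataset is a guess at which Open-Topo-Data dataset is queried, and the `fetch` parameter exists only so the logic can be exercised without network access.

```python
import json
from typing import Callable, Optional
from urllib.request import urlopen

# Public endpoints of the two services; the dataset name "srtm90m" is an assumption.
OPEN_TOPO_DATA_URL = "https://api.opentopodata.org/v1/srtm90m?locations={lat},{lng}"
OPEN_ELEVATION_URL = "https://api.open-elevation.com/api/v1/lookup?locations={lat},{lng}"


def fetch_json(url: str) -> dict:
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def lookup_elevation(lat: float, lng: float,
                     fetch: Callable[[str], dict] = fetch_json) -> Optional[float]:
    """Try Open-Topo-Data first, then Open-Elevation; return None if both fail."""
    for template in (OPEN_TOPO_DATA_URL, OPEN_ELEVATION_URL):
        try:
            data = fetch(template.format(lat=lat, lng=lng))
            elevation = data["results"][0]["elevation"]
            if elevation is not None:
                return float(elevation)
        except Exception:
            continue  # any failure falls through to the next service
    return None


def fill_missing_elevation(record: dict,
                           fetch: Callable[[str], dict] = fetch_json) -> dict:
    """Return a copy of the record with elevation_m filled in when it is 0."""
    if record.get("elevation_m", 0) != 0:
        return record
    found = lookup_elevation(record["lat"], record["lng"], fetch)
    return {**record, "elevation_m": found} if found is not None else record
```

Both services return the value under `results[0].elevation`, which is why one parsing path can serve the primary and the fallback.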
Output:
data/processed/fajr_angles.csv — 48,668 Fajr records
data/processed/isha_angles.csv — 34,529 Isha records
python -m src.pipeline --no-elevation-lookup

Skips the elevation API calls (Open-Topo-Data and Open-Elevation). Use this when:
- You're offline
- You want faster iteration while adding new records
- All records in verified_sightings.py already have non-zero elevations
Loading OpenFajr Birmingham iCal feed...
4018 Fajr records from OpenFajr
Loading manually verified sightings...
... genuine manually compiled records (after quality filter)
Loading ingested raw CSV sightings...
... records from raw CSVs
Loading pre-computed angle records (SQM instrument data)...
1621 pre-computed angle records
Computing solar depression angles...
Dropping N record(s) with implausible angles (< 7.0° Fajr / < 10.0° Isha):
...
Fajr dataset: 48668 records → data/processed/fajr_angles.csv
Isha dataset: 34529 records → data/processed/isha_angles.csv
Records dropped for "implausible angles" are usually data-entry errors or DST-transition artifacts. The quality filter (minimum 7° for Fajr, 10° for Isha) removes angles too shallow to correspond to a real sighting. All dropped records are logged so you can investigate them.
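As a rough sketch of how such a filter behaves, the snippet below partitions records by the thresholds quoted above. The field name `angle_deg` and the function name are hypothetical; the pipeline's own record layout may differ.

```python
# Minimum plausible depression angles, taken from the thresholds quoted above.
MIN_ANGLE_DEG = {"fajr": 7.0, "isha": 10.0}


def split_by_plausibility(records):
    """Partition records into (kept, dropped) using the per-prayer threshold."""
    kept, dropped = [], []
    for rec in records:
        threshold = MIN_ANGLE_DEG[rec["prayer"]]
        # "angle_deg" is a hypothetical field holding the computed depression angle
        (kept if rec["angle_deg"] >= threshold else dropped).append(rec)
    return kept, dropped


if __name__ == "__main__":
    sample = [
        {"prayer": "fajr", "angle_deg": 14.2},
        {"prayer": "fajr", "angle_deg": 3.1},   # e.g. a wrong UTC offset
        {"prayer": "isha", "angle_deg": 9.5},
    ]
    kept, dropped = split_by_plausibility(sample)
    print(f"Dropping {len(dropped)} record(s) with implausible angles")
```

Returning the dropped records instead of discarding them is what makes it possible to log them for later investigation.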
The OpenFajr Project runs a continuous community astrophotography program in Birmingham. A panel of scholars reviews daily sky photos and votes on the moment of true dawn. The voted times are published as a public Google Calendar iCal feed.
- ~4,018 records, 2016-2026
- Location: 52.4862°N, 1.8904°W, 141 m elevation
- All times are UTC (Z suffix in iCal)
- Fetched live by the pipeline — no local cache needed
This is the highest-quality source: actual community-reviewed per-date timestamps at a single well-documented location. It provides ~68% of the Fajr training data.
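Because the feed's timestamps carry a Z suffix, they can be read as UTC without any offset arithmetic. The sketch below assumes each sighting time is stored in a plain `DTSTART` property; that layout and the function name are assumptions, not something confirmed by the feed itself.

```python
from datetime import datetime, timezone


def parse_ical_start_times(ical_text: str) -> list[datetime]:
    """Extract UTC start times from DTSTART lines of the form 20240621T013800Z."""
    times = []
    for line in ical_text.splitlines():
        name, _, value = line.strip().partition(":")
        # Keep only plain DTSTART properties that carry a UTC ("Z") timestamp
        if name == "DTSTART" and value.endswith("Z"):
            parsed = datetime.strptime(value, "%Y%m%dT%H%M%SZ")
            times.append(parsed.replace(tzinfo=timezone.utc))
    return times
```

All-day entries such as `DTSTART;VALUE=DATE:20240621` are skipped by this sketch, since they carry no time of day.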
1,621 per-night SQM records across 46 Indonesian sites, extracted from Basthoni's 2022 PhD
dissertation at UIN Walisongo. Each record is a direct instrument measurement where the Fajr
depression angle was determined by linear fitting of SQM time-series data. Loaded by
src/collect/precomputed_angles.py.
Manually compiled records are located in src/collect/verified_sightings.py and per-source CSVs in data/raw/raw_sightings/.
These come from:
- Peer-reviewed academic papers (NRIAG Egypt, Malaysia, Indonesia, Saudi Arabia, Mauritania)
- Community observation programs (Miftahi/Shaukat UK, Asim Yusuf UK, Moonsighting.com)
- Institutional SQM data (BRIN Mount Timau, BRIN multistation network)
See Data Sources for the full citation table.
Open src/collect/verified_sightings.py and append to the VERIFIED_SIGHTINGS list:
{
"prayer": "fajr", # "fajr" or "isha"
"date_local": "2024-06-21", # ISO date, local calendar date
"time_local": "04:38", # HH:MM, 24-hour, local time at moment of sighting
"utc_offset": 1.0, # hours from UTC (e.g. 1.0 for BST, -5.0 for EST, 5.5 for IST)
"lat": 51.150, # decimal degrees (south = negative)
"lng": -3.650, # decimal degrees (west = negative)
"elevation_m": 430.0, # metres above sea level (0 = will be looked up by API)
"source": "Your citation here",
"notes": "Any relevant notes about conditions, method, observer count, etc.",
}

| Region | UTC offset |
|---|---|
| UK (BST, summer) | +1.0 |
| UK (GMT, winter) | 0.0 |
| Egypt / Eastern Europe (EET) | +2.0 |
| Egypt / EE (summer, EEST) | +3.0 |
| Saudi Arabia / Arabia Standard | +3.0 |
| Iran (IRST) | +3.5 |
| Iran (IRDT, summer) | +4.5 |
| UAE / Oman (GST) | +4.0 |
| Pakistan (PKT) | +5.0 |
| India / Sri Lanka (IST) | +5.5 |
| Bangladesh (BST) | +6.0 |
| Malaysia / Singapore (MYT) | +8.0 |
| Indonesia West (WIB) | +7.0 |
| Indonesia East (WIT) | +9.0 |
| Australia East (AEST, winter) | +10.0 |
| Australia East (AEDT, summer) | +11.0 |
| New Zealand (NZST) | +12.0 |
| New Zealand (NZDT) | +13.0 |
| US Eastern (EST) | -5.0 |
| US Eastern (EDT) | -4.0 |
| US Central (CST) | -6.0 |
| US Central (CDT) | -5.0 |
| West Africa (WAT) | +1.0 |
| East Africa (EAT) | +3.0 |
| South Africa (SAST) | +2.0 |
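
Before appending a record, a quick sanity check along these lines can catch a wrong offset sign or a malformed date. This is a hypothetical helper, not part of the repository, and the set of required keys is an assumption based on the example record above.

```python
from datetime import datetime

# Keys assumed to be required, based on the example record ("notes" treated as optional).
REQUIRED_KEYS = {"prayer", "date_local", "time_local", "utc_offset",
                 "lat", "lng", "elevation_m", "source"}


def validate_sighting(rec: dict) -> list[str]:
    """Return a list of problems found in a record; an empty list means none were found."""
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - set(rec))]
    if problems:
        return problems
    if rec["prayer"] not in ("fajr", "isha"):
        problems.append("prayer must be 'fajr' or 'isha'")
    try:
        datetime.strptime(rec["date_local"], "%Y-%m-%d")
    except ValueError:
        problems.append("date_local must be YYYY-MM-DD")
    try:
        datetime.strptime(rec["time_local"], "%H:%M")
    except ValueError:
        problems.append("time_local must be HH:MM (24-hour)")
    if not -12.0 <= rec["utc_offset"] <= 14.0:
        problems.append("utc_offset outside -12..+14 hours")
    if not -90.0 <= rec["lat"] <= 90.0:
        problems.append("lat outside -90..90")
    if not -180.0 <= rec["lng"] <= 180.0:
        problems.append("lng outside -180..180")
    return problems
```

A check like this only confirms the record is well-formed; a plausible but wrong offset (for example GMT entered for a summer date) still has to be caught by inspecting the computed angle.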
After adding records, run the pipeline and check the output. A correctly entered record should produce an angle between 8° and 21° for Fajr, or 11° and 22° for Isha. If the pipeline drops your record (angle below the threshold), the time is too close to sunrise/sunset — recheck the UTC offset and local time.
python -m src.pipeline --no-elevation-lookup 2>&1 | grep -A5 "Dropping"

The Isha dataset is the most critical gap at 46 records. Fajr has excellent Birmingham coverage but needs more geographic diversity:
| Gap | What to look for |
|---|---|
| Isha (all regions) | Shafaq al-Abyad disappearance logs with explicit per-date timestamps |
| South America | Any Muslim community observation records with coordinates and times |
| Southeast Asia | Additional Indonesian/Malaysian per-night SQM data files |
| High latitudes (55°N+) | Scandinavian or northern Canadian observation logs |
| Sub-Saharan Africa | Observation records from West Africa, East Africa, Southern Africa |