Fix globular clusters mis-typed as galaxies in NGC/IC import #445

Merged: brickbots merged 2 commits into main from fix/ngc-globular-type on May 29, 2026

@brickbots (Owner) commented May 29, 2026

Problem

A user reported NGC 7006 is listed as a galaxy but is a globular cluster. Investigation showed it's not alone — five globulars are mis-typed as Gx:

| Object | Steinicke TYPE | Was | Now |
|---|---|---|---|
| NGC 2808 | I | Gx | Gb |
| NGC 5824 | I | Gx | Gb |
| NGC 5834 | I | Gx | Gb |
| NGC 6864 (M 75) | I | Gx | Gb |
| NGC 7006 | I | Gx | Gb |

Root cause

For a globular, Steinicke's TYPE is the Shapley–Sawyer concentration class — a Roman numeral such as I — which is_galaxy_type() read as an Irregular galaxy. preprocess_steinicke_type() already had a rescue that reclassifies these as Gb when a GCL/globular cross-ID is present in the remarks, but it ran after the galaxy check, so it was dead code for exactly these objects.
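The ordering problem can be reproduced in a toy sketch. The helper bodies below are hypothetical stand-ins, not the code in steinicke_loader.py: the prefix rule in is_galaxy_type_sketch(), the "GCL" substring test, and the sample remarks value "GCL 999" are all assumptions made for illustration. Only the control flow (galaxy check first, globular rescue second) follows the description above.

```python
# Toy reproduction of the ordering bug; all helper bodies are assumptions.
CONCENTRATION_CLASSES = {
    "I", "II", "III", "IV", "V", "VI",
    "VII", "VIII", "IX", "X", "XI", "XII",
}


def is_galaxy_type_sketch(type_str: str) -> bool:
    # Assumed: Hubble-style prefixes, where a leading "I" means Irregular.
    return type_str.startswith(("E", "S", "I"))


def has_globular_cross_id(remarks: str) -> bool:
    # Assumed: a "GCL" token in the remarks marks a globular cross-ID.
    return "GCL" in remarks


def classify_buggy(type_str: str, remarks: str) -> str:
    if is_galaxy_type_sketch(type_str):
        return "Gx"  # a TYPE of "I" exits here
    if type_str in CONCENTRATION_CLASSES and has_globular_cross_id(remarks):
        return "Gb"  # never reached for TYPE values starting with "I"
    return "?"


print(classify_buggy("I", "GCL 999"))  # Gx
```

In this sketch the rescue branch still works for a class such as "V", which the galaxy check does not claim, so the failure is limited to the classes that collide with the Irregular prefix.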

Fix

Move the globular disambiguation ahead of the galaxy check. The existing is_trumpler_class() guard ensures objects that merely mention a globular in their remarks are not reclassified — e.g. IC 4802, a star group inside Pal 9, correctly stays Ast.
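A minimal sketch of the reordered flow, under stated assumptions: the real is_trumpler_class() is not shown in this PR, so looks_like_class() below is a hypothetical stand-in that assumes the guard accepts only Roman-numeral TYPE values. The raw TYPE "*Grp" used for the IC 4802 case, the OTHER_TYPES mapping, and the remarks strings are likewise invented for illustration.

```python
# Toy version of the reordered checks; helper bodies are assumptions.
CONCENTRATION_CLASSES = {
    "I", "II", "III", "IV", "V", "VI",
    "VII", "VIII", "IX", "X", "XI", "XII",
}
OTHER_TYPES = {"*Grp": "Ast"}  # hypothetical raw TYPE -> output code


def looks_like_class(type_str: str) -> bool:
    # Stand-in for the guard: TYPE itself must be a Roman-numeral class.
    return type_str in CONCENTRATION_CLASSES


def is_galaxy_type_sketch(type_str: str) -> bool:
    # Assumed: Hubble-style prefixes, where a leading "I" means Irregular.
    return type_str.startswith(("E", "S", "I"))


def classify_fixed(type_str: str, remarks: str) -> str:
    # Globular rescue now runs before the galaxy check.
    if looks_like_class(type_str) and "GCL" in remarks:
        return "Gb"
    if is_galaxy_type_sketch(type_str):
        return "Gx"
    return OTHER_TYPES.get(type_str, "?")


print(classify_fixed("I", "GCL 999"))        # Gb
print(classify_fixed("I", ""))               # Gx
print(classify_fixed("*Grp", "in GCL 999"))  # Ast
```

The third call models an object whose remarks mention a globular but whose TYPE is not a concentration class, so the guard leaves it alone.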

Changes

  • steinicke_loader.py — reorder the globular rescue before is_galaxy_type, with an explanatory comment.
  • test_catalog_data.py — add NGC 7006 and NGC 6864 (M 75) as Gb regression assertions.
  • pifinder_objects.db — regenerated via the full import pipeline.

Verification

All 4 tests in test_catalog_data.py pass; the rebuilt DB shows the 5 globulars as Gb and IC 4802 as Ast.

🤖 Generated with Claude Code

Test coverage (why this slipped through)

The previous tests only spot-checked a handful of famous objects in the built DB, and test_object_counts cannot detect type errors, since an object typed Gx instead of Gb still counts as one object. Added python/tests/test_steinicke_parsing.py:

  • Unit tests of preprocess_steinicke_type() — one case per decision branch and known ambiguity (Roman-numeral galaxy/globular/Trumpler overlap, combinations, suffix stripping, the IC 4802 guard). No DB required.
  • An invariant audit over the real Steinicke source: every object with a globular cross-ID and a concentration-class TYPE must map to Gb — catching the whole class generically rather than enumerating named objects.

31 new tests; full suite (test_steinicke_parsing.py + test_catalog_data.py) green.
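The two kinds of test can be sketched as follows. This is not the content of test_steinicke_parsing.py: the classifiers are toy stand-ins, the Row fields and remarks values are invented, and a plain loop replaces the parametrized tests so the sketch has no dependencies. Passing the classifier in as an argument shows that the audit flags the old ordering and passes the new one.

```python
from typing import Callable, Iterable, List, NamedTuple

CLASSES = {"I", "II", "III", "IV", "V", "VI",
           "VII", "VIII", "IX", "X", "XI", "XII"}
Classifier = Callable[[str, str], str]


class Row(NamedTuple):  # hypothetical shape of one source record
    name: str
    type_str: str
    remarks: str


def is_galaxy(type_str: str) -> bool:
    # Assumed prefix rule; a leading "I" is read as Irregular.
    return type_str.startswith(("E", "S", "I"))


def classify_buggy(type_str: str, remarks: str) -> str:
    if is_galaxy(type_str):
        return "Gx"
    if type_str in CLASSES and "GCL" in remarks:
        return "Gb"
    return {"*Grp": "Ast"}.get(type_str, "?")


def classify_fixed(type_str: str, remarks: str) -> str:
    if type_str in CLASSES and "GCL" in remarks:
        return "Gb"
    if is_galaxy(type_str):
        return "Gx"
    return {"*Grp": "Ast"}.get(type_str, "?")


# Unit cases: (TYPE, remarks, expected), one per branch of the toy classifier.
CASES = [
    ("I", "GCL 999", "Gb"),
    ("I", "", "Gx"),
    ("Sb", "", "Gx"),
    ("*Grp", "in GCL 999", "Ast"),
]


def failing_cases(classify: Classifier) -> List[tuple]:
    return [(t, r) for t, r, want in CASES if classify(t, r) != want]


def audit_globulars(rows: Iterable[Row], classify: Classifier) -> List[str]:
    # Invariant: globular cross-ID plus concentration-class TYPE must give Gb.
    return [
        row.name for row in rows
        if row.type_str in CLASSES and "GCL" in row.remarks
        and classify(row.type_str, row.remarks) != "Gb"
    ]


ROWS = [
    Row("NGC 7006", "I", "GCL 999"),
    Row("NGC 2808", "I", "GCL 999"),
    Row("IC 4802", "*Grp", "in GCL 999"),
]
print(audit_globulars(ROWS, classify_buggy))  # ['NGC 7006', 'NGC 2808']
print(audit_globulars(ROWS, classify_fixed))  # []
```

The audit selects rows by property rather than by name, which is why it would also catch a mis-typed globular that nobody has reported yet.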

brickbots and others added 2 commits May 28, 2026 19:53
Steinicke's TYPE for a globular is its Shapley-Sawyer concentration class
(a Roman numeral, e.g. "I"), which is_galaxy_type() read as an Irregular
galaxy. The existing GCL-cross-ID rescue in preprocess_steinicke_type()
ran *after* the galaxy check, so it was dead code for these objects.

Move the globular disambiguation ahead of the galaxy check. The
is_trumpler_class() guard keeps objects that merely mention a globular in
their remarks (e.g. IC 4802, a star group inside Pal 9) from being
reclassified. Corrects NGC 7006, NGC 2808, NGC 5824, NGC 5834 and
NGC 6864 (M 75), previously typed Gx. Rebuilds pifinder_objects.db and
adds NGC 7006 / M 75 regression assertions.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The existing data-validation tests only spot-check a few well-known objects
in the built DB, so the globular-as-galaxy mis-typing went unnoticed. Add
direct coverage of the parsing layer where the bug lived:

- Tier 1: parametrized unit tests of preprocess_steinicke_type(), one case
  per decision branch and known ambiguity (Roman-numeral galaxy/globular/
  Trumpler overlap, combinations, suffix stripping, the IC 4802 guard).
- Tier 2: an invariant audit over the real Steinicke source asserting that
  every object with a globular cross-ID and a concentration-class TYPE maps
  to Gb -- catching the whole class generically, not just named objects.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@brickbots brickbots merged commit 1ea1234 into main May 29, 2026
1 check passed