# HII Region Database Generator
Generates a database containing a subset of the information about Galactic HII regions in the
[WISE Catalog of Galactic HII Regions](http://astro.phys.wvu.edu/wise/)
([paper](https://ui.adsabs.harvard.edu/abs/2014ApJS..212....1A/abstract)) along with additional
information related to recent radio recombination line surveys. The database structure is described
in `schema.txt`.

First, download the data files from THIS LINK.

Then, if necessary, update the WISE Catalog database by running the notebook `generate_wise_catalog.ipynb`.

Copyright (C) 2025 by<br>
Trey V. Wenger; tvwenger@gmail.com<br>
L. D. Anderson; <br>
This code is licensed under the MIT license (see LICENSE for details).

In [6]:
# Import packages
import os
from hii_db import generate
from hii_db import add_parallax
from hii_db import add_distances
from hii_db import utils

import importlib
importlib.reload(utils)

<module 'hii_db.utils' from '/home/twenger/science/python/hii_db/hii_db/utils.py'>

In [7]:
# Set filepath for data directory
data_dir = "/media/drive1/hii-db-data/"

# WISE Catalog CSV file
wise_master_version = "3.0"  # keep as a string so the filename is formatted exactly as written
wise_csv_file = os.path.join(data_dir, "wise", f"WISE_HII_V{wise_master_version}.csv")

if not os.path.exists(wise_csv_file):
    raise FileNotFoundError(wise_csv_file)
print(f"Using WISE Catalog CSV: {wise_csv_file}")

# Output database file
db_version = "v2025_07_29"
db_file = os.path.join(data_dir, "db", f"hii_{db_version}.db")
wise_db_file = os.path.join(data_dir, "db", f"hii_{db_version}_wise.db")
print(f"Database will be overwritten: {db_file} and {wise_db_file}")

Using WISE Catalog CSV: /media/drive1/hii-db-data/wise/WISE_HII_V3.0.csv
Database will be overwritten: /media/drive1/hii-db-data/db/hii_v2025_07_29.db and /media/drive1/hii-db-data/db/hii_v2025_07_29_wise.db
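
Because the output files are overwritten, it can be useful to keep a copy of any existing database first. The following is a minimal sketch of one way to do that; `backup_if_exists` and the `.bak` suffix are hypothetical names introduced here and are not part of `hii_db`.

```python
import os
import shutil


def backup_if_exists(path, suffix=".bak"):
    """Copy `path` to `path + suffix` if it exists; return the backup path or None.

    Hypothetical helper for illustration only; not part of hii_db.
    """
    if not os.path.exists(path):
        return None
    backup_path = path + suffix
    # copy2 preserves the modification time along with the file contents
    shutil.copy2(path, backup_path)
    return backup_path
```

In this notebook it could be called as `backup_if_exists(db_file)` and `backup_if_exists(wise_db_file)` before running the build cells.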


# Build WISE Catalog-only Database

In [8]:
generate.main(wise_db_file, wise_csv_file, wise_only=True, data_dir=data_dir)

Resetting database...
Done!

Generating WISE Catalog table...
8589
8589
Done!

Generating Groups table...
Group G000.510-00.051 different VLSR nan vs nan
Group WB43 different VLSR nan vs nan
Group WB43 different VLSR nan vs nan
Done!

Matching Catalog to Groups...
Done!

Adding WISE Detections...
PROBLEM WITH G000.320-00.215 COMPONENTS
[9.6]
[0.5]
[27.2]
[0.7]
[nan]
[nan]
PROBLEM WITH G012.685+00.008 COMPONENTS
[ 34.1 109. ]
[0.7 0.9]
[30.4  nan]
[1.7 nan]
[0.18105     nan]
[0.03995     nan]
PROBLEM WITH G014.221-00.545 COMPONENTS
[19.7 54.5]
[0.8 1.7]
[28.4  nan]
[ 2. nan]
[0.11475     nan]
[0.0289    nan]
PROBLEM WITH G023.195-00.001 COMPONENTS
[90.7 22.4]
[0.7 1.7]
[28.3  nan]
[1.6 nan]
[0.12155     nan]
[0.02635     nan]
PROBLEM WITH G055.855-03.803 COMPONENTS
[27.6]
[0.2]
[24.2]
[0.3]
[nan]
[nan]
PROBLEM WITH G107.209-01.334 COMPONENTS
[-36.]
[0.4]
[26.9]
[0.7]
[nan]
[nan]
PROBLEM WITH G108.503+06.356 COMPONENTS
[-10.3]
[0.3]
[23.9]
[0.6]
[nan]
[nan]
PROBLEM WITH G114.605-00.801 C
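
After the build finishes, a quick sanity check is to list the tables and their row counts. The sketch below assumes the generated `.db` file is a SQLite database, which this notebook does not state explicitly; `table_row_counts` is a hypothetical helper and not part of `hii_db`. Compare the result against `schema.txt`.

```python
import os
import sqlite3
from contextlib import closing


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in a SQLite file.

    Assumes `db_path` is a SQLite database (an assumption, not confirmed
    by the notebook). Hypothetical helper for illustration only.
    """
    if not os.path.exists(db_path):
        # sqlite3.connect would silently create an empty file otherwise
        raise FileNotFoundError(db_path)
    with closing(sqlite3.connect(db_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
```

For example, `table_row_counts(wise_db_file)` would return a dictionary keyed by table name.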

# Add Parallax Data to WISE-only Database

In [11]:
reid19_datafile = os.path.join(data_dir, "masers", "reid2019_merge.txt")
reid19_reffile = os.path.join(data_dir, "masers", "reid2019_refs.txt")

add_parallax.main(wise_db_file, datafile=reid19_datafile, reffile=reid19_reffile)

Resetting Parallax and CatalogParallax tables...
Done.

Adding Reid+2019 Parallax sources...
Populating Parallax table...
Done!

Matching Parallax sources to WISE Catalog...
Matching WISE G000.314-00.194 to parallax G000.31-00.20 based on WISE size
Matching WISE G000.314-00.205 to parallax G000.31-00.20 based on WISE size
Matching WISE G000.320-00.215 to parallax G000.31-00.20 based on WISE size
Matching WISE G000.382+00.017 to parallax G000.37+00.03 based on separation (85.6 arcsec)
Matching WISE G000.657-00.041 to parallax G000.67-00.03 based on separation (65.5 arcsec)
Matching WISE G000.660-00.052 to parallax G000.67-00.03 based on separation (86.8 arcsec)
Matching WISE G000.666-00.036 to parallax G000.67-00.03 based on separation (28.4 arcsec)
Matching WISE G000.666-00.051 to parallax G000.67-00.03 based on separation (72.9 arcsec)
Matching WISE G000.670-00.043 to parallax G000.67-00.03 based on separation (40.2 arcsec)
Matching WISE G000.670-00.035 to parallax G000.67-00.03 based
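
The log above reports some matches "based on separation" in arcseconds. As a rough illustration of how such a separation can be computed from Galactic coordinates, here is a minimal sketch using the haversine formula. Both function names are hypothetical, and the actual matching criteria in `add_parallax` (separation threshold, use of the WISE size) are not shown in this notebook. Coordinates parsed from source names are rounded, so the result only approximates the logged values.

```python
import math


def parse_gname(gname):
    """Parse a name like 'G000.666-00.036' into (glong_deg, glat_deg)."""
    body = gname[1:]  # drop the leading 'G'
    # The latitude sign is the last '+' or '-' in the name
    idx = max(body.rfind("+"), body.rfind("-"))
    return float(body[:idx]), float(body[idx:])


def separation_arcsec(l1, b1, l2, b2):
    """Angular separation (arcsec) between two positions given in degrees."""
    l1, b1, l2, b2 = map(math.radians, (l1, b1, l2, b2))
    # Haversine formula, numerically stable for small separations
    h = (
        math.sin((b2 - b1) / 2.0) ** 2
        + math.cos(b1) * math.cos(b2) * math.sin((l2 - l1) / 2.0) ** 2
    )
    return math.degrees(2.0 * math.asin(math.sqrt(h))) * 3600.0


wise_l, wise_b = parse_gname("G000.666-00.036")
plx_l, plx_b = parse_gname("G000.67-00.03")
sep = separation_arcsec(wise_l, wise_b, plx_l, plx_b)
print(f"Approximate separation from rounded names: {sep:.1f} arcsec")
```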

# Compute Kinematic Distances for WISE-only Database

In [9]:
add_distances.main(
    wise_db_file,
    num_samples=100,
    batchsize=100,
    rotcurve="reid19_rotcurve",
    tablename="Distances_Reid2019",
)

Found 2126 unique Catalog sources with VLSR

Resetting Distances_Reid2019 table...
Done.

Computing kinematic distances (group 0)...
Done.

Computing kinematic distances (group 1)...
Done.

Computing kinematic distances (group 2)...
Done.

Computing kinematic distances (group 3)...
Done.

Computing kinematic distances (group 4)...


Process ForkPoolWorker-601:
Process ForkPoolWorker-602:
Process ForkPoolWorker-609:
Process ForkPoolWorker-607:
Process ForkPoolWorker-604:
Process ForkPoolWorker-606:
Process ForkPoolWorker-622:
Process ForkPoolWorker-605:
Process ForkPoolWorker-608:
Process ForkPoolWorker-624:
Process ForkPoolWorker-623:
Process ForkPoolWorker-615:
Process ForkPoolWorker-610:
Process ForkPoolWorker-603:
Process ForkPoolWorker-621:
Process ForkPoolWorker-618:
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call last):
  File "/home/twenger/miniforge3/envs/hii_db/lib/python3.11/site-packages/multiprocess/process.py", line 314, in _bootstrap
    self.run()
Traceback (most recent call last):
Traceback (most recent call last):
Traceback (most recent call 

KeyboardInterrupt: 
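
For orientation, the sketch below shows the basic idea behind a kinematic distance and its Monte Carlo resampling. It is a deliberate simplification and not the method used by `add_distances`: it swaps in a flat rotation curve instead of `reid19_rotcurve`, ignores Galactic latitude, and uses assumed illustrative values for the solar Galactocentric distance and rotation speed. All function names are hypothetical.

```python
import math
import random
import statistics

R0_KPC = 8.15       # assumed Galactocentric distance of the Sun (kpc)
THETA0_KMS = 236.0  # assumed circular speed of the flat rotation curve (km/s)


def kinematic_distances(glong_deg, vlsr_kms, r0=R0_KPC, theta0=THETA0_KMS):
    """Return (near, far) distances in kpc for a flat rotation curve at b = 0.

    An entry is None when that solution is not physical.
    """
    sin_l = math.sin(math.radians(glong_deg))
    cos_l = math.cos(math.radians(glong_deg))
    if abs(sin_l) < 1e-12:
        raise ValueError("Kinematic distances are undefined toward l = 0 or 180 deg")
    # Flat curve: vlsr = theta0 * sin(l) * (r0 / R - 1), solved for R
    denom = 1.0 + vlsr_kms / (theta0 * sin_l)
    if denom <= 0.0:
        return None, None
    rgal = r0 / denom
    # Law of cosines: R^2 = r0^2 + d^2 - 2 * r0 * d * cos(l), solved for d
    disc = rgal**2 - (r0 * sin_l) ** 2
    if disc < 0.0:
        return None, None  # velocity exceeds the tangent-point velocity
    root = math.sqrt(disc)
    near = r0 * cos_l - root
    far = r0 * cos_l + root
    return (near if near > 0.0 else None, far if far > 0.0 else None)


def sample_near_distance(glong_deg, vlsr_kms, e_vlsr_kms, num_samples=100, seed=0):
    """Median near distance (kpc) over Gaussian resamples of the velocity."""
    rng = random.Random(seed)
    samples = []
    for _ in range(num_samples):
        near, _far = kinematic_distances(glong_deg, rng.gauss(vlsr_kms, e_vlsr_kms))
        if near is not None:
            samples.append(near)
    return statistics.median(samples) if samples else None
```

In the inner Galaxy both the near and far solutions are valid, which is the kinematic distance ambiguity; in the outer Galaxy only the far root is positive.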

# Build All-RRL Database

In [20]:
# Import packages
import os
from hii_db import generate
from hii_db import add_parallax
from hii_db import add_distances
from hii_db import balser_te_2015
from hii_db import wenger_te_2019
from hii_db import brown_shrds_pilot
from hii_db import wenger_shrds_2019
from hii_db import wenger_shrds_2021

import importlib
importlib.reload(generate)
importlib.reload(balser_te_2015)
importlib.reload(wenger_te_2019)
importlib.reload(brown_shrds_pilot)
importlib.reload(wenger_shrds_2019)
importlib.reload(wenger_shrds_2021)

<module 'hii_db.wenger_shrds_2021' from '/home/twenger/science/python/hii_db/hii_db/wenger_shrds_2021.py'>

In [21]:
generate.main(db_file, wise_csv_file, wise_only=False, data_dir=data_dir)

Resetting database...
Done!

Generating WISE Catalog table...
8589
8589
Done!

Generating Groups table...
Group G000.510-00.051 different VLSR nan vs nan
Group WB43 different VLSR nan vs nan
Group WB43 different VLSR nan vs nan
Done!

Matching Catalog to Groups...
Done!

Adding WISE Detections...
PROBLEM WITH G000.320-00.215 COMPONENTS
[9.6]
[0.5]
[27.2]
[0.7]
[nan]
[nan]
PROBLEM WITH G012.685+00.008 COMPONENTS
[ 34.1 109. ]
[0.7 0.9]
[30.4  nan]
[1.7 nan]
[0.18105     nan]
[0.03995     nan]
PROBLEM WITH G014.221-00.545 COMPONENTS
[19.7 54.5]
[0.8 1.7]
[28.4  nan]
[ 2. nan]
[0.11475     nan]
[0.0289    nan]
PROBLEM WITH G023.195-00.001 COMPONENTS
[90.7 22.4]
[0.7 1.7]
[28.3  nan]
[1.6 nan]
[0.12155     nan]
[0.02635     nan]
PROBLEM WITH G055.855-03.803 COMPONENTS
[27.6]
[0.2]
[24.2]
[0.3]
[nan]
[nan]
PROBLEM WITH G107.209-01.334 COMPONENTS
[-36.]
[0.4]
[26.9]
[0.7]
[nan]
[nan]
PROBLEM WITH G108.503+06.356 COMPONENTS
[-10.3]
[0.3]
[23.9]
[0.6]
[nan]
[nan]
PROBLEM WITH G114.605-00.801 C