### Identify objects in SDSS Moving-Objects Catalogue DR1 with `rocks`

In [1]:
import numpy as np
import pandas as pd
import rocks

# nest_asyncio is required for asynchronous operations in Jupyter notebooks
import nest_asyncio

nest_asyncio.apply()

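Download SDSS MOC1 (6.2 MB) from [https://faculty.washington.edu/ivezic/sdssmoc/sdssmoc1.html](https://faculty.washington.edu/ivezic/sdssmoc/sdssmoc1.html). The cell below reads the compressed catalogue file directly from that site and keeps only the number and designation columns.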

In [2]:
data = pd.read_fwf(
    "https://faculty.washington.edu/ivezic/sdssmoc/ADR1.dat.gz",
    colspecs=[(244, 250), (250, 270)],
    names=["numeration", "designation"],
)

print(f"Number of observations in SDSS MOC1: {len(data)}")

# Remove the unknown objects (designation "-"); copy to avoid chained-assignment warnings
data = data[data.designation.str.strip(" ") != "-"].copy()
print(f"Unique known objects: {len(set(data.designation))}")

# Unnumbered objects should be NaN
data.loc[data.numeration == 0, "numeration"] = np.nan

Number of observations in SDSS MOC1: 58117
Unique known objects: 10585
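
The `colspecs` passed to `pd.read_fwf` above are half-open character intervals, so `(244, 250)` selects the same characters as the Python slice `line[244:250]`. The sketch below illustrates this on a synthetic record rather than a real catalogue line. The helper `slice_fields` and the padded sample line are made up for illustration; only the two column ranges are taken from the cell above.

```python
# Column ranges copied from the read_fwf call; everything else is illustrative.
COLSPECS = {"numeration": (244, 250), "designation": (250, 270)}


def slice_fields(line, colspecs=COLSPECS):
    """Cut fixed-width fields out of one record using half-open slices."""
    return {
        name: line[start:stop].strip()
        for name, (start, stop) in colspecs.items()
    }


# Synthetic record: 244 filler characters, then a 6-character number field
# and a 20-character designation field (not a real MOC1 line).
sample = " " * 244 + f"{3633:>6d}" + f"{'Mira':<20s}"
fields = slice_fields(sample)
print(fields)
```

Unlike this sketch, `read_fwf` also infers column types, which is why the number column arrives as numeric values rather than strings.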


Get the current designations and numbers of the known objects with `rocks.identify`.

In [3]:
# Create list of identifiers by merging 'numeration' and 'designation' columns
ids = data.numeration.fillna(data.designation)
print("Identifying known objects in catalogue..")
names_numbers = rocks.identify(ids)


Identifying known objects in catalogue..


The names and numbers are returned as (name, number) pairs, in the order of the passed identifiers. We can add them to the SDSS data using two simple list comprehensions.

In [4]:
# Add names and numbers to data
data["name"] = [name_number[0] for name_number in names_numbers]
data["number"] = [name_number[1] for name_number in names_numbers]

# The nullable Int64 dtype supports integers and missing values
data["number"] = data["number"].astype("Int64")

# Print part of the result
data.head()

| | numeration | designation | name | number |
|---|---|---|---|---|
| 1 | NaN | 1999_RL189 | 1999 RL189 | 159415 |
| 3 | 11659.0 | 1997_EX41 | 1997 EX41 | 11659 |
| 7 | 3633.0 | Mira | Mira | 3633 |
| 8 | NaN | 2765_P-L | 2765 P-L | 39383 |
| 9 | NaN | 2000_SR274 | 2000 SR274 | 62569 |
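
Because the results come back in the order of the identifiers passed in, and the catalogue holds several observations of many objects, one possible variation is to resolve each distinct identifier once and map the results back onto every row. The sketch below shows the idea with a hypothetical stand-in, `fake_identify`, in place of `rocks.identify`, so that it runs offline. Its lookup table is built from rows of the output above. Whether this saves time with `rocks` itself, which may cache or deduplicate on its own, is an assumption not tested here.

```python
def fake_identify(identifiers):
    """Hypothetical offline stand-in for rocks.identify.

    Returns one (name, number) tuple per identifier, in input order.
    """
    known = {
        "1999_RL189": ("1999 RL189", 159415),
        11659: ("1997 EX41", 11659),
        3633: ("Mira", 3633),
    }
    return [known.get(identifier, (None, None)) for identifier in identifiers]


# Identifiers with repeats, as in a catalogue with several observations per object
ids = [3633, "1999_RL189", 3633, 11659, "1999_RL189"]

# Resolve each distinct identifier once, keeping first-seen order
unique_ids = list(dict.fromkeys(ids))
resolved = dict(zip(unique_ids, fake_identify(unique_ids)))

# Map the results back onto every observation, then split into two columns
names_numbers = [resolved[identifier] for identifier in ids]
names, numbers = zip(*names_numbers)
print(names)
print(numbers)
```

The `zip(*names_numbers)` call is an alternative to the two list comprehensions used in the notebook; both rely on each result being a (name, number) pair.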
