table.join_skycoord fails with large input tables: too many matches. SkyCoord.search_around_sky works fine though #13054
Comments
Welcome to Astropy 👋 and thank you for your first issue! A project member will respond to you as soon as possible; in the meantime, please double-check the guidelines for submitting issues and make sure you've provided the requested details. GitHub issues in the Astropy repository are used to track bug reports and feature requests; if your issue poses a question about how to use Astropy, please instead raise your question in the Astropy Discourse user forum and close this issue. If you feel that this issue has not been responded to in a timely manner, please leave a comment mentioning our software support engineer @embray, or send a message directly to the development mailing list. If the issue is urgent or sensitive in nature (e.g., a security vulnerability), please send an e-mail directly to the private e-mail feedback@astropy.org.
It looks like this issue about Getting the expected output from |
Thanks a lot for the quick answer. I understand the short version is: keep using SkyCoord.search_around_sky if I want predictable and repeatable output that does not depend on the input table length.

As a naïve user, I would never expect fuzzy logic to be applied to table joins, at least not as the default behaviour. I understand it might help when analyzing really big tables, but the unpredictability renders it useless for me.

I would really recommend updating the docstring, so people only use table.join_skycoord when they know what they are doing. I would also suggest adding a way to do exact table joins on sky distance. Maybe the answer is to use SkyCoord.search_around_sky, but it should be explicit.

Thanks again,
Alcione
+1 on the above statements, and thanks for reporting this. I was looking for the Astropy way to join two tables: looping through the first table and finding all matches within a radius of a few arcsec in a second table (what I'd consider a fairly normal use case). I found the […] I think the current implementation is indeed unexpected (if not dangerous) for what one expects to be a deterministic operation with a single solution. At least a warning would be very useful.

FWIW, here's what I ended up using, in case anyone is looking for something similar:

import astropy.units as u
from astropy.table import hstack
from astropy.coordinates import SkyCoord

# cat1 and cat2 are two catalogs (tables); ra1, dec1, ra2, dec2 are the
# to-be-defined names of their RA and Dec columns, and unit1, unit2 their units
c1 = SkyCoord(cat1[ra1], cat1[dec1], unit=unit1)
c2 = SkyCoord(cat2[ra2], cat2[dec2], unit=unit2)

# match within 5 arcsec; note the reversed order of c2 and c1:
# the first index array refers to the argument (c1), the second to the caller (c2)
idc1, idc2, d2d, d3d = c2.search_around_sky(c1, 5*u.arcsec)

# one output row per matched pair, with the columns of both catalogs side by side
matched_tables = hstack([cat1[idc1], cat2[idc2]], join_type='exact')
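For anyone who wants to see what an exact, deterministic match within a radius means independently of any library, here is a minimal brute-force sketch in plain Python. The function names (`angular_separation_deg`, `match_within`) and the sample coordinates are made up for illustration. It compares every pair, so it costs O(N×M), and it is not a description of how Astropy implements `search_around_sky` or `join_skycoord`.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    sin_ddec = math.sin((dec2 - dec1) / 2)
    sin_dra = math.sin((ra2 - ra1) / 2)
    a = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    # Clamp to 1.0 to guard against rounding slightly above the asin domain
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def match_within(cat1, cat2, radius_arcsec):
    """Return every (i, j) pair with separation <= radius_arcsec.

    cat1 and cat2 are sequences of (ra, dec) tuples in degrees.
    The result depends only on the coordinates and the radius.
    """
    radius_deg = radius_arcsec / 3600.0
    pairs = []
    for i, (ra1, dec1) in enumerate(cat1):
        for j, (ra2, dec2) in enumerate(cat2):
            if angular_separation_deg(ra1, dec1, ra2, dec2) <= radius_deg:
                pairs.append((i, j))
    return pairs


# Hypothetical catalogs: the first source in cat1 has two neighbours in cat2
# (3.6 and 1.8 arcsec away); the second source has none.
cat1 = [(10.0, 20.0), (50.0, -30.0)]
cat2 = [(10.0, 20.001), (10.0, 20.0005), (120.0, 5.0)]

pairs = match_within(cat1, cat2, radius_arcsec=5)
print(pairs)  # [(0, 0), (0, 1)]
```

The list of index pairs plays the same role as `idc1` and `idc2` in the snippet above: a source with several neighbours inside the radius appears once per neighbour, and adding unrelated rows to either catalog does not change the pairs already found.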
Dear Astropy team,
I am using table.join_skycoord to do a cross-match between two astropy tables. However, I get wrong results when the input tables are large (1,000 and 852,128 rows), whereas SkyCoord.search_around_sky works fine. I provide example code and input files reproducing the problem.
Any help would be appreciated
Thank you very much
Alcione Mora
t1.fits.zip
t2.fits.zip