Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tool to get SFD dustmap values #733

Closed
eteq opened this issue Feb 5, 2013 · 40 comments
Closed

Tool to get SFD dustmap values #733

eteq opened this issue Feb 5, 2013 · 40 comments

Comments

@eteq
Copy link
Member

eteq commented Feb 5, 2013

(This is based in part on the discussion in #700)

We should add a function/class that retrieves the Schlegel, Finkbeiner, and Davis 1998 dustmaps, caches them, and uses that to get E(B-V) at arbitrary locations (specified by objects from astropy.coordinates). We probably also should include the R_X coefficients from the SFD paper, so people can get out extinctions directly in their preferred bandpass.

We might also include the improved Schlafly & Finkbeiner 2011 calibrations, although that might not be "standard" enough yet.

Note that there's already a python/numpy implementation of the interpolation scheme in astropysics. So most of the work is already done between that and the astropy data/caching framework.

@adrn
Copy link
Member

adrn commented Feb 6, 2013

This would be kickass, so I'd be happy to help!

@kbarbary
Copy link
Member

I'd also like to see this happen (although I'm no dust expert). To get the ball rolling, here's some initial thoughts I had about interface and implementation:

  • A single function seems like enough to me, unless there's a good reason to use classes. Names? mwdust(), dustval()?
  • Would this live in astropy.utils? I didn't closely follow the removal of astropy.tools.
  • Would we retrieve the maps directly from http://www.astro.princeton.edu/~schlegel/dust/data/data.html at least for the time being (since there's no astropy data server)?
  • We may wish to have a source keyword argument that can specify to retrieve the values from the IRSA web interface instead of downloading the maps (e.g. source='irsa' or source='maps'). In fact, we might want to make this the default so that users don't unwittingly run the function once and have to wait for a huge download to finish. That is, in order to download and cache the maps, you have to actually read the docs (where you'll be warned about the size of the download and cache).
  • Looks like astropysics uses scipy.ndimage for interpolation. I suppose we'll make interpolate an optional kwarg so that the function is usable without scipy?

@astrofrog
Copy link
Member

Does this belong in astroquery? and more generally, how do we decide what should live in astroquery, and what should get moved to the core?

@kbarbary
Copy link
Member

@astrofrog I was wondering the same about astroquery, but I'm not that familiar with it. I do think that if we are going to have two methods of retrieving values (from cached files or from IRSA), they should be accessible from the same function.

@eteq
Copy link
Member Author

eteq commented Feb 11, 2013

I think this does not belong in astroquery, because in my mind astroquery is about accessing science data, while this is primarily about getting the extinction value for a given point in the sky. In my mind this is a fundmental necessary tool/utility (although that may reflect my extragalactic bias). I guess that means the title of the issue wasn't quite right, but that's what I was thinking.

I agree with @kbarbary that one function seems fine, as well as interpolate should only work when scipy is present (unless there's a way to simply re-cast those operations into numpy-only code... I haven't really thought hard about it). I think it's also fine to pull straight from the princeton site. This may be the sort of thing one would always access from there anyway - in my mind the astropy data server was for data specific to the project, rather than for science data that we sometimes use (but that could change if bandwidth is an issue or something).

As for where it should live, that's an excellent question: I don't think it belongs in utils because that's now mostly for code users are unlikely to want. One possibility might be coordinates, as this is all about getting the dust corrections in a given direction. But that seems a little un-intuitive, I admit.

Maybe this needs to be in its own place? In the future there will probably be something for photometry, and this might make sense there. Alternatively, we could make something like obstools - that's the package this is in in astropysics, and I made it basically because there were some things that are useful observation tools, but don't otherwise have an obvious place (the other thing that naturally sits there would be specific forms of extinction laws, and possibly things like distance modulus calculations. It might even be a good place to put classes for fluxes/magnitudes.)

@eteq
Copy link
Member Author

eteq commented Feb 11, 2013

Oh, and now that I look a bit more, its not obvious to me that there's a good way to get the IRSA form of the data - the intent of that site seems to be pretty much for people viewing things on the web, so parsing the output might be needlessly complex.

@kbarbary
Copy link
Member

@eteq I have the same thinking about this being a fundamentally necessary tool for doing other things (and also the same extragalactic bias).

Location-wise, I feel like this belongs in a grab-bag of tools (that's where I would look first as a user). obstools is OK but that sounds like tools for observers or tools having to do directly with telescope observations, which this isn't, necessarily. What about misc like scipy has? Or perhaps it should go in utils after all. In the reference docs for utils, we could clearly separate into different sections the generally useful functions and those mainly of interest to developers.

There's a programmable interface for IRSA described here:
http://irsa.ipac.caltech.edu/applications/DUST/docs/dustProgramInterface.html

It returns an XML document, so you can parse the values out in about 5 lines of code.

@astrofrog
Copy link
Member

Small comment: as a non-extragalactic astronomer, I think we would need to make sure that it's made obvious this is the total extinction through the galaxy - obviously for us galactic observers, things are a lot more complex for the dust extinction.

Now for the bigger issue: to me, SFD isn't more fundamental than say SIMBAD, which is currently in astroquery. SFD is itself science data (granted, that's used for other science), and it's certainly not exact, so it's not truly 'fundamental'. The reason I want to keep these online services separate for now (and including SFD) is because they are much more prone to breakage. If the service's API changes then suddenly all Astropy users have to update and we need to release an update just for this, or they have to go and edit a configuration file.

I think there should be a bigger discussion on any kind of online querying in Astropy in future. We need to find ways to make it robust in the long term (e.g. developing an API layer we can control on a server we control, which translates the calls if the APIs ever change). Once we have a good idea how we can manage this, then I'm all for including things like SFD, Simbad, and Vizier (imagine Table.from_vizier('J/A+A/518/L73')) in the core package. But in the mean time, they can be developed and tested in astroquery. The purpose of astroquery was a kitchen sink for all kind on online querying, science or not, until we figure out which ones to include in the core and how.

@astrofrog
Copy link
Member

Naive question - is there a VO service for the SFD maps? If so, it could potentially go in astropy.vo.

@astrofrog
Copy link
Member

As to where this could belong once it did make it to the core - how about astropy.query? Then any online service we move from astroquery can naturally go into astropy.query.

@kbarbary
Copy link
Member

@astrofrog Pretty reasonable. Just to be clear, you're only talking about the functionality for querying IRSA, not downloading and caching the maps, right? Seems like your comment also applies to downloading the maps directly from an externally hosted site, such as the Princeton site. But if we had the maps on an astropy-controlled server, you'd be OK with the download-and-cache function being in the core package?

@kbarbary
Copy link
Member

For location I'm more concerned with the function that uses the maps directly -- and I don't think astropy.query is right for that, of course.

@astrofrog
Copy link
Member

@kbarbary - yes, if we hosted it ourselves, it would be easier to make sure the API doesn't break, but I'm not necessarily advocating storing everything ourselves, since in some cases the datasets are huge, but we can also provide API translation. Basically, I'm just suggesting that once 0.2 is out, we start a discussion on mailing list or google hangout about how to deal with online services in general.

@astrofrog
Copy link
Member

Regarding location, what do you mean by the function that uses the maps directly? If it's used to correct photometry, then it should go into astropy.photometry. But the querying code itself, which could be used directly by users, probably belongs in something like astropy.query (and trust me, we do not want to talk about astropy.misc and astropy.utils, nor astropy.tools!)

@kbarbary
Copy link
Member

I mean a function originally described in the issue which would be analogous to, e.g., DUST_GETVAL in IDL, but with automatic download and cache of the maps. I said astropy.query doesn't seem right because once the maps are downloaded and cached, you're not querying an external server. I suppose you're "querying" values in the maps on your hard drive... but that isn't how I'd normally think of it.

In supernova measurements, we tend to separate "photometry" from these dust corrections: Photometry measurements are summing up flux on your images. Dust corrections are something you do after you've determined the total observed flux of the source. This distinction is useful: for example, I would use these dust corrections for simulations that don't even go to the image level, and otherwise wouldn't touch astropy.photometry.

@astrofrog
Copy link
Member

IMHO, caching is only a convenience - you're still querying a dataset/database (note, for the reason you suggest, I would not support astropy.online or astropy.web). But in any case, I think this warrants a discussion on-list, and not just for SFD.

@kbarbary
Copy link
Member

Agreed. And astropy.query is starting to make more sense once I stop thinking about it as "online" and start thinking about as "dataset"...

@eteq
Copy link
Member Author

eteq commented Feb 12, 2013

I still think this isn't quite the same as astropy.query. Many extragalactic astronomers don't think of SFD as a dataset (even if maybe they should...) but rather as a prerequisite for doing just about anything that they're actually going to publish. And I think what @kbarbary said three posts up is the same as most think of it - it's not actually part of the measurement process, but an "interpretation" stage - usually it's the very last thing you do before you start plotting stuff.

I see @astrofrog's point on the IRSA version, though. That requires an actual API and interfacing, so all the other issues in astropy.query apply (and thanks for pointing out the XML interface to that, @kbarbary - I hadn't seen it before). But I think providing an easy way to download and cache the maps and use them in day-to-day work is a different matter. It's probably one of the most common questions I've heard: "How do I get the SFD dustmaps in python?" And it's actually literally what I was thinking of when I designed the cache stuff.

So I think @astrofrog is right that we need a wider discussion on the matter, but I think we need this function in some form no matter what.

@astrofrog
Copy link
Member

Just to clarify my position - I think the base code for SFD should be in astropy.query, but of course, there could be hooks to it elsewhere, in the photometry, or cosmology, etc. But the basic code is basically just querying a dataset/database, and I think it could live alongside code for querying SIMBAD, etc. The name resolution could also it into that sub-package.

@eteq
Copy link
Member Author

eteq commented Feb 12, 2013

Hmm, we might actually be talking across each other, @astrofrog - what do you mean by the "base" code? The main thing I'm thinking about in this issue is code that does the following:

  1. Takes in a coordinate and transforms it to ICRS
  2. Downloads the E(B-V) map from the Princeton server (with astropy caching)
  3. interpolates along the map in the way the SFD paper recommends
  4. converts that E(B-V) value to the appropriate A_X value, based on the table (using either SFD or the newer Schlafly & Finkbeiner approach, or an option for both).

So which, if any, of those are you thinking should be in query?

It might have not been clear from the original post, but I'm thinking of this as a useful convenience function like name_resolve. It's a very practical thing that people have asked me about a few times. So I was hoping it could get in soonish (ideally 0.3). astropy.query fully deserves quite a bit of thinking about, though, so I wouldn't want to rush it for this.

@astrofrog
Copy link
Member

@eteq - I would personally put 1-4 in astropy.query, and then we can put convenience methods that access this functionality anywhere you want, e.g. in classes in astropy.coordinates, astropy.photometry, astropy.cosmology, etc. I don't feel strongly about this though, but just want to make sure that we make whatever system we have robust to web API breakage.

@zblz
Copy link
Member

zblz commented Feb 13, 2013

A similar tool that would be interesting to include for X-ray astonomy is a function to obtain the total galactic neutral hydrogen column density for given coordinates. A good all-sky survey to use is the Leiden/Argentine/Bonn (LAB) Survey of Galactic HI, found in VizieR at http://vizier.cfa.harvard.edu/viz-bin/Cat?VIII/76

It is similar to the SFD dustmap in that a simple download of the catalog is not enough, some interpolation would be required to get the value at the given coordinates. The is an FTOOL that does exactly this, called nH, as well as a web interface. A simpler possibility could be querying the web interface, but its functionality would be limited to the nH tool and vulnerable to changes in the web interface. A source keyword based solution as proposed by @kbarbary above could be used to choose between VizieR-map-download or web-query.

Another useful tool would be a calculator of galactic gamma-ray diffuse emission from code such as GALPROP, which is used in Fermi/LAT analysis. However, I believe there is no available table-like output, but just the code to generate your own. An all-sky, low-resolution map could be distributed with astropy, or use a astropy-hosted higher-resolution map.

@eteq
Copy link
Member Author

eteq commented May 9, 2013

@astrofrog - after both forgetting about this and then thinking about it for a little while, what I was trying to say above is that it seems to me that 1,3, and 4 are not "query" related, because they're all about things like coordinates and mapping, and could apply just as well if you downloaded the dust maps yourself. That is, they are not tied to a specific query or web interface, but rather a specific dataset (which is very unlikely to be modified ever again, although the ways of querying it might). The idea is that the dust maps just give you the (for example) E(B-V) values in a given sightline, and after that it's up to you what to do with it (sometimes it's photometric corrections, sometimes it's correcting a spectrum, and sometimes it's computing a column density or something).

It may be this is just semantics, though - it sounds like we agree that something like this should be in the core, the question is just exactly which parts go in which subpackages.

(As an aside, there are plenty of other sources of dust maps, but most of them make sense to approach as a query rather than actually downloading maps - that's what astropy/astroquery#82 is about)

@astrofrog
Copy link
Member

@eteq - I wonder whether part of this could be implemented as: given an NDData object with WCS information, and given a list of coordinate objects, extract e.g. the nearest pixel, or interpolate at that position. This is something that would be generically useful and then could be used here. Maybe this is finally a chance to try and get useful functionality into NNData? It would finally also give us an excuse to sub-class NDData into a 'Image/Map' class which is a 2-d map where both WCS coordinates are sky coordinates (maybe 'SkyMap'?).

@kbarbary
Copy link
Member

This might also be useful for situations where you have a 3-or-greater dimension grid of data and want to interpolate between data points. For example, suppose you have a grid of galaxy spectra models for various ages and metallicities: I don't know of an easy way to extract a single spectrum, or flux values for a single given (age, metallicity) pair (not necessarily in the grid). If your data are only 2d, you can use scipy.interpolate.RectBivariateSpline. On the other hand, scipy.ndimage.interpolation.map_coordinates can do the n-d interpolation, but only once you have the desired position in "pixel" coordinates.

@eteq
Copy link
Member Author

eteq commented May 14, 2013

👍 to @astrofrog's idea here. That's an excellent generalization of this use case. I think the starting point would be to use scipy.ndimage.interpolation.map_coordinates, because the wcs lets you go both ways easily enough. Anything much further beyond that is a big can of worms because it becomes more "fitting" than interpolation.

@cdeil
Copy link
Member

cdeil commented Jun 5, 2013

I just saw this discussion by chance and while I'm not interested in dustmap values, I'm interested in astroquery and world-coordinate-bases lookup / interpolation (there's issue #619, it's probably just a few lines of code, but so far it hasn't been clear where to put this in astropy, please comment there and I can do it).

Some of the issues mentioned are on the agenda for discussion the astroquery hangout on Friday (announcement, agenda).

Note that in astroquery there are already interfaces to Vizier, IRSA, Fermi.
I don't think they currently contain the functionality asked for here, but maybe it's better have everything in one place: astroquery for now, maybe moved to astropy.query in half a year when everything's in good shape and has been tested for a while?

@AstroMike
Copy link

There is now astroquery.irsa_dust
http://astroquery.readthedocs.org/en/latest/irsa/irsa_dust.html
that can give the SFD or SandF11 values for different filter combinations

@astrofrog
Copy link
Member

@eteq - can this be closed now it's in astroquery?

@kbarbary
Copy link
Member

There's still a need to get values of the full locally stored fits images. When you're doing thousands of positions, the web query can take a significant amount of time (dominant in a real use case I had).

@eteq
Copy link
Member Author

eteq commented Jul 16, 2014

@astrofrog - It's not quite the same thing as this issue, I think. astroquery only provides a way to use the IRSA interface to the dustmaps - it doesn't actually get the dustmaps for you or anything. The IRSA interface is good for many things, but if you want to do something like get the extinction for a bunch of objects, that's much more efficient to do with a local copy of the dustmap.

EDIT: posted this before seeing @kbarbary's, but same answer 😉

@AstroMike
Copy link

Yes, it all depends on your use: for a few objects, you don't want to download the SFD dust maps because they are relatively large (200 MB). On the other hand, for a large number of extinction calls, the query is slow.

@sebaguro
Copy link

sebaguro commented Oct 9, 2015

@eteq had an excellent method in "astropysics" to return E(B-V) values for multiple sources it was quick and one needed to first download the SFD_dust_4096_ngp.fits SFD_dust_4096_sgp.fits (granted)

I think astropy + affilated packages (incl. astroquery) currently do not efficiently support a E(B-V) query for a large number of sources. If true, would it not be worth reconsidering to include the etud astropysics code into astropy (as mentioned by Erik in one of the related tickets)?

@kbarbary
Copy link
Member

kbarbary commented Oct 9, 2015

I haven't looked at the implementation in astropysics lately, but we have similar code in sncosmo that could be upstreamed:

http://sncosmo.readthedocs.org/en/v1.1.x/api/sncosmo.SFD98Map.html (class interface)
http://sncosmo.readthedocs.org/en/v1.1.x/api/sncosmo.get_ebv_from_map.html (functional interface)

It also requires the user to download the maps manually, just because I felt dodgy about automatically dumping 200MB+ into ~/.cache/.

@sebaguro
Copy link

I guess being able to access many E(B-V) values (for many sources)
is useful for many extra-galactic projects and will become increasingly
useful in LSST era. Maybe it is a good idea to have kbarbary "sncosmo" or etud
"astropysics" version in core because of this argument?

Do I understand correctly that the newer 2011 E(B-V) values are just
found by re-scaling the 1998 E(B-V) values by a constant value? Such that:

E(B − V)2011 = 0.86[E(B − V)1998]

@kbarbary
Copy link
Member

For anyone finding this issue: there are now a few stand-alone pip packages that provide this functionality:

Brief comparisons here.

@bwinkel
Copy link
Contributor

bwinkel commented May 13, 2017

A similar tool that would be interesting to include for X-ray astonomy is a function to obtain the total galactic neutral hydrogen column density for given coordinates. A good all-sky survey to use is the Leiden/Argentine/Bonn (LAB) Survey of Galactic HI, found in VizieR at http://vizier.cfa.harvard.edu/viz-bin/Cat?VIII/76

Please don't use LAB data anymore. Even the main author of the data release paper (Peter Kalberla, a colleague of mine) considers it as deprecated. There is a much better alternative now: HI4PI. HI column densities are provided in HEALPix and various other projections.

@cdeil
Copy link
Member

cdeil commented Jul 6, 2018

@eteq - Given that this is available in other packages now (see comment #733 (comment) by @kbarbary above), maybe close this issue from 2013? Or would you still prefer to add / maintain this feature in Astropy core?

@kakirastern
Copy link
Contributor

Hi, am triaging and would like to check if this Issue should be kept open... Would opening a new issue and continue the discussion there in light of the developments leading up to late Feb 2019 be more favorable?

Please advise. Thanks!

@bsipocz
Copy link
Member

bsipocz commented Feb 22, 2019

I'm going ahead and close this issue. As @kakirastern pointed out, not much happened in core about it but in the meantime other packages and affiliated packages have been created to cover the topic.

@bsipocz bsipocz closed this as completed Feb 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests