In [1]:
import warnings
warnings.simplefilter('ignore')

# Marvin Queries
This tutorial goes through a few basics of how to perform queries on the MaNGA dataset using the Marvin Query tool. Please see the [Marvin Query](../../query/query.rst) page for more details on how to use Queries.  This tutorial covers the basics of:


 * querying on metadata information from the NSA catalog
 * how to combine multiple filters and return additional parameters
 * how to perform radial cone searches with Marvin
 * querying on information from the MaNGA DAPall summary file
 * querying using quality and target flags 

First let's import some basics

In [4]:
# we should be using DR17 MaNGA data
from marvin import config
config.release

# import the Query tool
from marvin.tools.query import Query

## Query Basics  
### Querying on Metadata 
Let's go through some Query basics of how to do a query on metadata.  The two main keyword arguments to Query are **search_filter** and **return_params**.  **search_filter** is a string representing the SQL ``where`` condition you'd like to filter on.  This tutorial assumes a basic familiarity with the SQL boolean syntax needed to construct Marvin Queries.  Please see the [tutorial on SQL Boolean syntax](../boolean-search-tutorial.rst) to learn more. **return_params** is a list of parameters you want to return in the query in addition to those used in the SQL filter condition. 

Let's search for all galaxies with a redshift less than 0.1. To specify our search parameter, redshift, we must know the database table and name of the parameter. In this case, MaNGA uses the NASA-Sloan Atlas (NSA) for redshift information.  In the NSA catalog, the redshift is the **z** parameter of the **nsa** table, so our search parameter will be ``nsa.z``.  Generically, all search parameters will take the form `table.parameter`.

In [5]:
# filter for galaxies with a redshift < 0.1
my_filter = 'nsa.z < 0.1'

In [6]:
# construct the query
q = Query(search_filter=my_filter)

# run the query
r = q.run()

# print some stuff
print(r)
print('number of results:', r.totalcount)

Marvin Results(query=nsa.z < 0.1, totalcount=9524, count=100, mode=local)
number of results: 9524


After constructing queries, we can run them with **q.run()**.  This returns a **Marvin Results** object. Let's take a look.  This query returned 9524 objects.  For queries with large results, the results are automatically paginated in sets of 100 objects.  Default parameters returned in queries always include the **mangaid** and **plateifu**.  Marvin Queries will also return any parameters used in the definition of your filter condition. Since we filtered on redshift, the redshift is automatically included.

In [7]:
# look at the current page of results (subset of 10)
print('number in current set:', len(r.results))
print(r.results[0:10])

number in current set: 100
<ResultSet(set=1.0/953, index=0:10, count_in_set=10, total=9524)>
[ResultRow(mangaid='1-1009', plateifu='11866-12705', z=0.064824656),
 ResultRow(mangaid='1-10166', plateifu='12514-9101', z=0.08058354),
 ResultRow(mangaid='1-10177', plateifu='12514-3704', z=0.048428684),
 ResultRow(mangaid='1-10263', plateifu='12514-1902', z=0.020132923),
 ResultRow(mangaid='1-1033', plateifu='10843-12704', z=0.012681201),
 ResultRow(mangaid='1-1033', plateifu='11866-9101', z=0.012681201),
 ResultRow(mangaid='1-1037', plateifu='10843-12703', z=0.039029263),
 ResultRow(mangaid='1-10375', plateifu='12510-1902', z=0.038751837),
 ResultRow(mangaid='1-1038', plateifu='11866-6102', z=0.033372667),
 ResultRow(mangaid='1-106204', plateifu='12068-6104', z=0.057716995)]
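
As a rough check on the pagination described above, the number of pages needed for a given total can be computed directly. This is plain arithmetic, not a Marvin call: `page_layout` is a hypothetical helper, and the default page size of 100 is taken from the text.

```python
import math

def page_layout(totalcount, page_size=100):
    """Return (number of pages, rows on the last page) for a paginated result."""
    if totalcount <= 0:
        return 0, 0
    n_pages = math.ceil(totalcount / page_size)
    # rows left over once all earlier pages are full
    last_page_rows = totalcount - (n_pages - 1) * page_size
    return n_pages, last_page_rows

n_pages, last_page_rows = page_layout(9524)
print('pages of 100:', n_pages, 'rows on last page:', last_page_rows)
```

With a page size of 10 the same arithmetic gives 953, which matches the set count shown in the sliced `ResultSet` header above.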


### Multiple Search Criteria and Returning Additional Parameters
We can easily combine query filter conditions by constructing a boolean string using AND.  Let's search for galaxies with a redshift < 0.1 and log M$_\star$ < 10.  The NSA catalog contains the Sersic profile determination for stellar mass, which is the **sersic_mass** or **sersic_logmass** parameter of the **`nsa`** table, so our search parameter will be **nsa.sersic_logmass**.  

Let's also return the object RA and Dec using the **return_params** keyword.  This accepts a list of string parameters.  Object RA and Dec are included in the **cube** table so the parameter names are `cube.ra` and `cube.dec`.

In [8]:
my_filter = 'nsa.z < 0.1 and nsa.sersic_logmass < 10'
q = Query(search_filter=my_filter, return_params=['cube.ra', 'cube.dec'])
r = q.run()
print(r)
print('Number of objects:', r.totalcount)

Marvin Results(query=nsa.z < 0.1 and nsa.sersic_logmass < 10, totalcount=4216, count=100, mode=local)
Number of objects: 4216


This query returns 4216 objects and now includes the RA, Dec, redshift and log Sersic stellar mass parameters. 
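
When a filter has several conditions, it can be convenient to keep them in a list and join them into one string. The sketch below is only an illustration: `combine_filters` is a hypothetical helper, not part of Marvin, and it does nothing more than string joining.

```python
def combine_filters(*conditions, joiner='and'):
    """Join individual filter conditions into a single boolean filter string."""
    # drop empty or whitespace-only entries so the result has no dangling joiner
    cleaned = [c.strip() for c in conditions if c and c.strip()]
    return ' {0} '.format(joiner).join(cleaned)

my_filter = combine_filters('nsa.z < 0.1', 'nsa.sersic_logmass < 10')
print(my_filter)
```

The resulting string has the same form as the one passed to `Query(search_filter=...)` in the cell above.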

In [9]:
# print the first 10 rows
r.results[0:10]

<ResultSet(set=1.0/422, index=0:10, count_in_set=10, total=4216)>
[ResultRow(mangaid='1-10263', plateifu='12514-1902', ra=200.400259414, dec=0.573031983277, sersic_logmass=8.915024660185969, z=0.020132923),
 ResultRow(mangaid='1-1033', plateifu='11866-9101', ra=149.707718089, dec=0.836657711419, sersic_logmass=8.849841125765641, z=0.012681201),
 ResultRow(mangaid='1-1033', plateifu='10843-12704', ra=149.707718089, dec=0.836657711419, sersic_logmass=8.849841125765641, z=0.012681201),
 ResultRow(mangaid='1-1037', plateifu='10843-12703', ra=149.657311739, dec=0.870762819465, sersic_logmass=9.406284586801183, z=0.039029263),
 ResultRow(mangaid='1-10375', plateifu='12510-1902', ra=202.924176092, dec=-0.466036685729, sersic_logmass=9.232796161922877, z=0.038751837),
 ResultRow(mangaid='1-106663', plateifu='12071-9101', ra=347.324231618, dec=0.0644395785777, sersic_logmass=9.415455347717751, z=0.015748328),
 ResultRow(mangaid='1-106670', plateifu='12071-9102', ra=347.345531209, dec=1.00060002

## Radial Queries in Marvin
Cone searches can be performed with Marvin Queries using the special ``radial`` string function in your SQL string.  The syntax for a cone search query is **radial(RA, Dec, radius)**.  Let's search for all galaxies within 0.5 degrees centered on RA, Dec = 232.5447, 48.6902.  The RA, Dec, and radius must all be given in decimal degrees. 

In [10]:
# build the radial filter condition
my_filter = 'radial(232.5447, 48.6902, 0.5)'
q = Query(search_filter=my_filter)
r = q.run()
print(r)
print(r.results)

Marvin Results(query=radial(232.5447, 48.6902, 0.5), totalcount=2, count=2, mode=local)
<ResultSet(set=1.0/1, index=0:2, count_in_set=2, total=2)>
[ResultRow(mangaid='1-209232', plateifu='8485-1901', ra=232.544703894, dec=48.6902009334),
 ResultRow(mangaid='1-209266', plateifu='8485-9101', ra=233.107502765, dec=48.8332849239)]
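
To see why both rows fall inside the cone, the angular separation of each result from the search center can be computed by hand. The sketch below uses the haversine formula in pure Python, with the coordinates copied from the output above. It is an independent check, not how Marvin evaluates ``radial``, and `angular_separation` is a hypothetical helper.

```python
import math

def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees between two sky positions (haversine)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))

center_ra, center_dec = 232.5447, 48.6902
radius = 0.5  # degrees

# (plateifu, ra, dec) copied from the query output above
rows = [('8485-1901', 232.544703894, 48.6902009334),
        ('8485-9101', 233.107502765, 48.8332849239)]

separations = {plateifu: angular_separation(center_ra, center_dec, ra, dec)
               for plateifu, ra, dec in rows}
for plateifu, sep in separations.items():
    print(plateifu, 'separation (deg):', round(sep, 4), 'inside cone:', sep <= radius)
```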


## Queries using DAPall parameters
MaNGA provides derived analysis properties in its **dapall** summary file.  Marvin allows for queries on any of the parameters in the file.  The table name for these parameters is **dapall**.  Let's find all galaxies that have a total measured star-formation rate > 5 M$_\odot$/year.  The total SFR parameter in the DAPall table is ``sfr_tot``.      

In [11]:
my_filter = 'dapall.sfr_tot > 5'
q = Query(search_filter=my_filter)
r = q.run()
print(r)
print(r.results)

Marvin Results(query=dapall.sfr_tot > 5, totalcount=98, count=98, mode=local)
<ResultSet(set=1.0/1, index=0:98, count_in_set=98, total=98)>
[ResultRow(mangaid='1-114035', plateifu='8619-6103', sfr_tot=6.2517962, bintype_name='HYB10', template_name='MILESHC-MASTARHC2'),
 ResultRow(mangaid='1-114035', plateifu='8619-6103', sfr_tot=12.301004, bintype_name='HYB10', template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-114532', plateifu='7975-3703', sfr_tot=50.95404, bintype_name='HYB10', template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-114532', plateifu='7975-3703', sfr_tot=50.912937, bintype_name='HYB10', template_name='MILESHC-MASTARHC2'),
 ResultRow(mangaid='1-114928', plateifu='7977-3702', sfr_tot=2154.7244, bintype_name='HYB10', template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-118449', plateifu='12089-12701', sfr_tot=5.211467, bintype_name='HYB10', template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-118449', plateifu='12089-12701', sfr_tot=5.1740203, bintyp

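The query returns 98 results, but looking at the plateifu, we see that some targets appear more than once.  This is because the DAPall file provides measurements for multiple analysis configurations, such as different bintypes, and by default the query returns an entry for each.  We can select on bintype using the ``bintype.name`` parameter. Let's filter on only the HYB10 bintype.  In this dataset every returned row is already HYB10, so the filtered query below still returns 98 rows, and the remaining repeats differ only in ``template_name``. 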

In [12]:
my_filter = 'dapall.sfr_tot > 5 and bintype.name==HYB10'
q = Query(search_filter=my_filter)
r = q.run()
print(r)
print(r.results)

Marvin Results(query=dapall.sfr_tot > 5 and bintype.name==HYB10, totalcount=98, count=98, mode=local)
<ResultSet(set=1.0/1, index=0:98, count_in_set=98, total=98)>
[ResultRow(mangaid='1-114035', plateifu='8619-6103', bintype_name='HYB10', sfr_tot=6.2517962, template_name='MILESHC-MASTARHC2'),
 ResultRow(mangaid='1-114035', plateifu='8619-6103', bintype_name='HYB10', sfr_tot=12.301004, template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-114532', plateifu='7975-3703', bintype_name='HYB10', sfr_tot=50.95404, template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-114532', plateifu='7975-3703', bintype_name='HYB10', sfr_tot=50.912937, template_name='MILESHC-MASTARHC2'),
 ResultRow(mangaid='1-114928', plateifu='7977-3702', bintype_name='HYB10', sfr_tot=2154.7244, template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-118449', plateifu='12089-12701', bintype_name='HYB10', sfr_tot=5.211467, template_name='MILESHC-MASTARSSP'),
 ResultRow(mangaid='1-118449', plateifu='12089-12701', b
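
Even with the bintype fixed, a plateifu can appear more than once in the output above, with the rows differing in `template_name`. A minimal sketch of grouping rows per target follows, using the six complete rows printed above as plain tuples. This is ordinary Python, not a Marvin API; in a real session the values would come from `r.results`.

```python
from collections import defaultdict

# (plateifu, sfr_tot, template_name) copied from the complete rows printed above
rows = [
    ('8619-6103', 6.2517962, 'MILESHC-MASTARHC2'),
    ('8619-6103', 12.301004, 'MILESHC-MASTARSSP'),
    ('7975-3703', 50.95404, 'MILESHC-MASTARSSP'),
    ('7975-3703', 50.912937, 'MILESHC-MASTARHC2'),
    ('7977-3702', 2154.7244, 'MILESHC-MASTARSSP'),
    ('12089-12701', 5.211467, 'MILESHC-MASTARSSP'),
]

# collect the SFR per template for each unique plateifu
by_target = defaultdict(dict)
for plateifu, sfr_tot, template_name in rows:
    by_target[plateifu][template_name] = sfr_tot

print('rows:', len(rows), 'unique plateifus:', len(by_target))
for plateifu, templates in by_target.items():
    print(plateifu, templates)
```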

## Query on Quality and Target Flags 
Marvin includes the ability to perform queries using quality or target flag information. These work using the special **quality** and **targets** keyword arguments.  These keywords accept a list of flag maskbit labels provided by the [Maskbit Datamodel](../../datamodel/dr15.rst#dr15-maskbits).  These keywords are inclusive, meaning they will only filter on objects satisfying those labels. 

### Searching by Target Flags
Let's find all galaxies that are in the MaNGA MAIN target selection sample. Targets in the MAIN sample are a part of the PRIMARY, SECONDARY and COLOR-ENHANCED samples.  These are the **primary**, **secondary**, and **color-enhanced** flag labels. The **targets** keyword accepts all labels from the MANGA_TARGET1, MANGA_TARGET2, or MANGA_TARGET3 maskbit schema. 

In [13]:
# create the targets list of labels
targets = ['primary', 'secondary', 'color-enhanced']
q = Query(targets=targets)
r = q.run()
print(r)
print('There are {0} galaxies in the main sample'.format(r.totalcount))
print(r.results[0:5])

Marvin Results(query=None, totalcount=5232, count=100, mode=local)
There are 5232 galaxies in the main sample
<ResultSet(set=1.0/1047, index=0:5, count_in_set=5, total=5232)>
[ResultRow(mangaid='1-1009', plateifu='11866-12705', manga_target1=2080),
 ResultRow(mangaid='1-10166', plateifu='12514-9101', manga_target1=1040),
 ResultRow(mangaid='1-10177', plateifu='12514-3704', manga_target1=2336),
 ResultRow(mangaid='1-10263', plateifu='12514-1902', manga_target1=1168),
 ResultRow(mangaid='1-1033', plateifu='11866-9101', manga_target1=1168)]


The **targets** keyword is equivalent to the ``cube.manga_targetX`` search parameter, where `X` is 1, 2, or 3.  The bits for the primary, secondary, and color-enhanced samples are 10, 11, and 12, respectively.  These combine into the value 7168.  The above query is equivalent to the filter condition ``cube.manga_target1 & 7168`` 

In [14]:
value = 1<<10 | 1<<11 | 1<<12
my_filter = 'cube.manga_target1 & {0}'.format(value)
q = Query(search_filter=my_filter)
r = q.run()
print(r)

Marvin Results(query=cube.manga_target1 & 7168, totalcount=9853, count=100, mode=local)
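
The bit arithmetic behind the value 7168 can be checked in plain Python. The sketch below uses only the bit positions quoted in the text (10, 11, 12) and decodes a few `manga_target1` values copied from the **targets** query output above. `labels_to_mask` and `labels_set` are hypothetical helpers, not Marvin functions.

```python
# bit positions quoted in the text above
TARGET1_BITS = {'primary': 10, 'secondary': 11, 'color-enhanced': 12}

def labels_to_mask(labels, bits=TARGET1_BITS):
    """Combine flag labels into a single integer bitmask."""
    mask = 0
    for label in labels:
        mask |= 1 << bits[label]
    return mask

def labels_set(value, bits=TARGET1_BITS):
    """Return the labels whose bit is set in value."""
    return [label for label, bit in bits.items() if value & (1 << bit)]

mask = labels_to_mask(['primary', 'secondary', 'color-enhanced'])
print('combined mask:', mask)

# manga_target1 values copied from the targets query output above
for value in (2080, 1040, 2336, 1168):
    print(value, labels_set(value), 'in main sample:', bool(value & mask))
```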


Let's search only for galaxies that are ``Milky Way Analogs`` or ``Dwarfs`` ancillary targets. 

In [15]:
targets = ['mwa', 'dwarf']
q = Query(targets=targets)
r = q.run()
print(r)
print('There are {0} galaxies from the Milky Way Analogs and Dwarfs ancillary target catalogs'.format(r.totalcount))
print(r.results)

Marvin Results(query=None, totalcount=69, count=69, mode=local)
There are 69 galaxies from the Milky Way Analogs and Dwarfs ancillary target catalogs
<ResultSet(set=1.0/1, index=0:69, count_in_set=69, total=69)>
[ResultRow(mangaid='1-114147', plateifu='8619-12705', manga_target3=16384),
 ResultRow(mangaid='1-114253', plateifu='8619-12704', manga_target3=16384),
 ResultRow(mangaid='1-117559', plateifu='12087-6104', manga_target3=8192),
 ResultRow(mangaid='1-118202', plateifu='12089-1901', manga_target3=8396800),
 ResultRow(mangaid='1-121994', plateifu='9485-9102', manga_target3=16384),
 ResultRow(mangaid='1-123217', plateifu='11745-6101', manga_target3=8192),
 ResultRow(mangaid='1-124604', plateifu='8439-6103', manga_target3=8192),
 ResultRow(mangaid='1-131989', plateifu='8546-12702', manga_target3=16384),
 ResultRow(mangaid='1-135491', plateifu='9869-6101', manga_target3=8192),
 ResultRow(mangaid='1-135927', plateifu='11947-6102', manga_target3=8192),
 ResultRow(mangaid='1-146067', pla

### Searching by Quality Flags
The **quality** keyword accepts all labels from the MANGA_DRPQUAL and MANGA_DAPQUAL maskbit schema.  Let's find all galaxies that suffered from bad flux calibration.  This is the flag **BADFLUX** (bit 8) from the MANGA_DRPQUAL maskbit schema.  

In [16]:
quality = ['BADFLUX']
q = Query(quality=quality)
r = q.run()
print(r)
print('There are {0} galaxies with bad flux calibration'.format(r.totalcount))
print(r.results[0:10])

Marvin Results(query=None, totalcount=118, count=118, mode=local)
There are 118 galaxies with bad flux calibration
<ResultSet(set=1.0/12, index=0:10, count_in_set=10, total=118)>
[ResultRow(mangaid='1-108143', plateifu='12772-6104', quality=1073746242),
 ResultRow(mangaid='1-10856', plateifu='11024-6104', quality=16640),
 ResultRow(mangaid='1-109860', plateifu='12685-6101', quality=1073746242),
 ResultRow(mangaid='1-113273', plateifu='7972-1902', quality=1073742146),
 ResultRow(mangaid='1-121035', plateifu='8144-1901', quality=1073742144),
 ResultRow(mangaid='1-135564', plateifu='11941-12705', quality=16704),
 ResultRow(mangaid='1-138351', plateifu='12490-1901', quality=1073746242),
 ResultRow(mangaid='1-149170', plateifu='8997-3701', quality=1073742144),
 ResultRow(mangaid='1-152769', plateifu='8936-3704', quality=1073742144),
 ResultRow(mangaid='1-154049', plateifu='11742-3704', quality=16704)]


The **quality** keyword is equivalent to the search parameters ``cube.quality`` for DRP flags or ``file.quality`` for DAP flags.  The above query is equivalent to ``cube.quality & 256``.  You can also perform a NOT bitmask selection using the ``~`` symbol.  A NOT selection can only be used with the ``cube.quality`` parameter. Let's select galaxies that are flagged for quality problems other than bad flux calibration.  

In [17]:
# the above query as a filter condition
q = Query(search_filter='cube.quality & 256')
r = q.run()
print('Objects with bad flux calibration:', r.totalcount)

# objects with bad quality other than bad flux calibration
q = Query(search_filter='cube.quality & ~256')
r = q.run()
print('Bad objects with no bad flux calibration:', r.totalcount)

Objects with bad flux calibration: 118
Bad objects with no bad flux calibration: 4044
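
In plain integer arithmetic, `value & 256` tests bit 8, while `value & ~256` is nonzero whenever any bit other than bit 8 is set, whether or not bit 8 is also set. The sketch below applies both tests in pure Python to some of the `quality` values printed by the BADFLUX query earlier. How Marvin translates the ``~`` filter into SQL is not shown in this tutorial, so the correspondence to the query above is an assumption.

```python
BADFLUX = 1 << 8  # bit 8 of MANGA_DRPQUAL, as quoted in the text (value 256)

# quality values copied from the BADFLUX query output above
values = [1073746242, 16640, 1073742146, 1073742144, 16704]

for q in values:
    has_badflux = bool(q & BADFLUX)   # True if bit 8 is set
    other_bits = q & ~BADFLUX         # whatever remains after clearing bit 8
    print(q, 'BADFLUX set:', has_badflux, 'other bits:', other_bits)
```

A value of 0 fails both tests, since no bits are set at all.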


To find only objects with good quality, that is, with no quality flags set, use ``cube.quality == 0``. 

In [18]:
q = Query(search_filter='cube.quality == 0')
r = q.run()
print(r)
print('Objects with good quality:', r.totalcount)

Marvin Results(query=cube.quality == 0, totalcount=7229, count=100, mode=local)
Objects with good quality: 7229


## Useful Resources

Check out these pages on the Marvin Docs site for more information on querying with Marvin.

- [Query](../../query/query.rst)
- [Query Datamodel](../../datamodel/query_dm.rst)
- [Results](../../query/results.rst)
- [SQL Boolean Syntax Tutorial](../boolean-search-tutorial.rst)