# How to get and use MOC data from VESPA in Python with ElasticSearch

This notebook illustrates how to search for MOC data in VESPA using ElasticSearch.
It also gives an example of a simple query that groups together MOCs sharing the same value for a specific field (channel_id), and then lists the elements located in the area defined by the intersection of all of those groups.

## Import required modules

In [1]:
from elasticsearch import Elasticsearch

## Connect to the ElasticSearch server

You can replace the following URL with the address of your own ElasticSearch server, and provide a password or a certificate file if needed.

In [2]:
es = Elasticsearch('http://voparis-elasticsearch.obspm.fr:9200')
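
As a minimal sketch of the "password or certificate file" case, the helper below collects the keyword arguments for the client. The helper name `build_client_kwargs`, the URL, the credentials and the certificate path are all placeholders, and the parameter names (`hosts`, `basic_auth`, `ca_certs`) assume version 8.x of the elasticsearch Python package.

```python
# Hypothetical helper (not part of the notebook): builds the keyword
# arguments for the client. Parameter names assume elasticsearch-py 8.x.
def build_client_kwargs(url, username=None, password=None, ca_certs=None):
    kwargs = {"hosts": url}
    # Only add authentication when both parts of the credentials are given
    if username is not None and password is not None:
        kwargs["basic_auth"] = (username, password)
    # Path to a CA certificate file, for servers using HTTPS
    if ca_certs is not None:
        kwargs["ca_certs"] = ca_certs
    return kwargs


kwargs = build_client_kwargs(
    "https://localhost:9200",          # placeholder URL
    username="elastic",                # placeholder credentials
    password="changeme",
    ca_certs="/path/to/http_ca.crt",   # placeholder path
)
print(kwargs)
# With the import from the first cell, the client would then be created with:
# es = Elasticsearch(**kwargs)
```

Keeping the arguments in a dictionary makes it easy to switch between an open server, as used in this notebook, and a protected one.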

## Get a list of the indices

Display a list of all the indices on the server.
Each index contains documents; here, a document is an item with fields (coverage, target_name, ...).

In [3]:
print("Indices list:\n")
for index in es.indices.get(index='*'):
    print(index)

Indices list:

.ds-.monitoring-es-8-mb-2022.10.20-000001
.ds-.monitoring-es-8-mb-2022.10.23-000002
.ds-.monitoring-es-8-mb-2022.10.26-000003
.ds-.monitoring-es-8-mb-2022.10.29-000004
.ds-.monitoring-es-8-mb-2022.11.01-000005
.ds-.monitoring-es-8-mb-2022.11.04-000006
.ds-.monitoring-es-8-mb-2022.11.07-000007
.ds-.monitoring-es-8-mb-2022.11.10-000008
.ds-.monitoring-es-8-mb-2022.11.13-000009
.ds-.monitoring-es-8-mb-2022.11.16-000010
.ds-.monitoring-es-8-mb-2022.11.19-000011
.ds-metricbeat-8.4.3-2022.10.20-000001
.ds-metricbeat-8.4.3-2022.11.19-000002
exoplanet
exoplanet_index
moc-index
obsfacility_index
test_chloe
vespa-index
vespa_index


## Get the number of items in moc-index in Vvex, grouped by target name

In this notebook we only consider the Vvex service in the index moc-index.
To do so, we use a term query to keep only the documents with service_title = vvex, and we pass the index 'moc-index' (stored in the variable indexName) to the search function.
You can change the service in the query variable below.
This query uses a terms aggregation to count the documents of the Vvex service in moc-index, grouped by target_name. Only the 5 largest groups are returned.

In [4]:
indexName = 'moc-index'

In [5]:
query = {
    "bool": {
         "must": [
             {
                "term": {
                    "service_title.keyword": {
                        "value": "vvex"
                    }
                }
            }
         ]
    }
}

aggs = {
    "agg-test-terms-count" : {
        "terms": {
            "field" : "target_name.keyword",
            "size": 5
        },
    }
}


page = es.search(
    index=indexName,
    query=query,
    size=0,
    fields=["*"],
    aggs = aggs
)

for bucket in page["aggregations"]["agg-test-terms-count"]["buckets"] :
    print(bucket["key"],":",bucket["doc_count"])

Venus : 45304
Earth : 292
Star : 16
Mars : 8
Mercury : 8
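
The bucket structure read in the loop above can also be flattened into a plain dictionary. The following sketch does this on a hand-written response that mimics the output shown above, so it runs without a server; the function name `buckets_to_counts` is an illustrative assumption, not part of the notebook.

```python
# Hypothetical helper: turns the buckets of a terms aggregation into
# a {key: doc_count} dictionary.
def buckets_to_counts(response, agg_name):
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Hand-written stand-in for the response of es.search(...),
# filled with the counts printed above.
sample_response = {
    "aggregations": {
        "agg-test-terms-count": {
            "buckets": [
                {"key": "Venus", "doc_count": 45304},
                {"key": "Earth", "doc_count": 292},
                {"key": "Star", "doc_count": 16},
                {"key": "Mars", "doc_count": 8},
                {"key": "Mercury", "doc_count": 8},
            ]
        }
    }
}

counts = buckets_to_counts(sample_response, "agg-test-terms-count")
for target_name, doc_count in counts.items():
    print(target_name, ":", doc_count)
```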


## Check if an index has a coverage field

We can check whether an index has a coverage field by looking at its mapping.

In [6]:
def hasCoverage(es, indexName) :
    return "coverage" in es.indices.get_mapping(index=indexName)[indexName]["mappings"]["properties"]

In [7]:
if not hasCoverage(es,indexName) :
    print("There is no coverage field in the index ", indexName)
else :
    print("There is a coverage field in the index ", indexName)

There is a coverage field in the index  moc-index
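
The mapping lookup above raises a KeyError if the index name is not a key of the response or if the index has no properties yet. A more tolerant variant could look like the sketch below, which works on a hand-written mapping so that it runs without a server; the function name `has_field` and the field types in the sample are assumptions made for illustration.

```python
# Hypothetical helper: same check as hasCoverage, but on an already
# fetched mapping, and returning False instead of raising KeyError.
def has_field(mapping_response, index_name, field_name):
    properties = (
        mapping_response.get(index_name, {})
        .get("mappings", {})
        .get("properties", {})
    )
    return field_name in properties


# Hand-written stand-in for es.indices.get_mapping(index="moc-index");
# the field types are made up.
sample_mapping = {
    "moc-index": {
        "mappings": {
            "properties": {
                "coverage": {"type": "text"},
                "granule_uid": {"type": "text"},
            }
        }
    }
}

print(has_field(sample_mapping, "moc-index", "coverage"))      # field present
print(has_field(sample_mapping, "moc-index", "channel_id"))    # field absent
print(has_field(sample_mapping, "missing-index", "coverage"))  # unknown index
```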


## Import modules for Aladin and MOCs

In [8]:
from mocpy import MOC
from ipyaladin import Aladin

## How to load a MOC in Aladin

In this section we will see how to load a MOC in Aladin.

### Get the data

In the following query we again check for a coverage, but with another method: instead of inspecting the mapping, we use an exists query, which keeps only the documents in which the field is present.

In [9]:
query = {
    "bool": {
        "must" : [
            {
                "exists" : {
                    "field" : "coverage"
                }
            },
            {
                "term": {
                    "service_title.keyword": {
                        "value": "vvex"
                    }
                }
            }
        ]
    }
}


page = es.search(
    index=indexName,
    query=query,
    size=10,
    fields=["*"],
)

data = []
for document in page["hits"]["hits"] :
    data.append(document["_source"])
    print(document["_source"]["granule_uid"])

VV0086_19G
VV0117_02G
VV0085_15C
VV0084_11C
VV0084_14C
VV0084_10G
VV0084_03G
VV0083_03C
VV0082_13C
VV0082_13G


### Load Aladin and add the data that was fetched just above

Here, and in the following examples, we are using the HiPS (Hierarchical Progressive Surveys) of Venus.
Other HiPS can be found at http://voparis-srv-paris.obspm.fr/vo/planeto/hips/

In [10]:
aladin = Aladin(
    coo_frame="body",
    survey="http://voparis-srv-paris.obspm.fr/vo/planeto/hips/CDS_P_Venus_Magellan_C3-MDIR-2025m/"
)
aladin

Aladin(coo_frame='body', options=['allow_full_zoomout', 'coo_frame', 'fov', 'full_screen', 'log', 'overlay_sur…

### Load the MOCs

First, for each MOC, we create a MOC object with mocpy and then convert it to a dictionary, since that is the format ipyaladin expects in the function add_moc_from_dict.

In [11]:
for item in data :
    mocObject = MOC.from_str(item["coverage"])
    jsonMoc = mocObject.serialize(format='json', optional_kw_dict=None, pre_v2=False)
    aladin.add_moc_from_dict(jsonMoc, {'opacity' : 0.5, 'name' : item["granule_uid"]})

We center the view on one of the MOCs.

In [12]:
mocCenter = MOC.from_str(data[0]["coverage"]).barycenter()
aladin.target = mocCenter.to_string()

## Union and intersection of MOCs

In this section, we are going to see how to search for specific MOCs.
Here we only consider MOCs in the Northern Hemisphere, whose dataproduct_type is spectral cube (sc) and whose minimum observation local time is at most 20 h (local_time_min $\leq$ 20).

Then, we will group those MOCs into three categories, corresponding to the channel_id field associated with them.
Finally, we will search for all the elements whose MOC overlaps the MOC defined as the intersection of those three groups.

### Get the data

The range query is used to specify a range for the given field.
For instance, in the first range query, we have: 0 $\leq$ c1min $\leq$ 360

In [13]:
query = {
    "bool": {
         "must": [
             {
                "range" : {
                    "c1min" : {
                        "gte" : 0,
                        "lte" : 360
                    }
                }
             },
             {
                "range" : {
                    "c1max" : {
                        "gte" : 0,
                        "lte" : 360
                    }
                }
             },
             {
                "range" : {
                    "c2min" : {
                        "gte" : 0,
                        "lte" : 90
                    }
                }
             },
             {
                "range" : {
                    "c2max" : {
                        "gte" : 0,
                        "lte" : 90
                    }
                }
             },
             {
                "range" : {
                    "local_time_min" : {
                        "lte" : 20
                    }
                }
             },
             {
                "term" : {
                    "dataproduct_type" : "sc"
                }
            },
            {
                "term": {
                    "service_title.keyword": {
                    "value": "vvex"
                  }
                 }
             },
             {
                "exists" : {
                    "field" : "coverage"
                }
            }
        ]
    }
}

page = es.search(
    index=indexName,
    query=query,
    size=50,
    fields=["*"],
)

data = []
for document in page["hits"]["hits"] :
    data.append(document["_source"])
    print(document["_source"]["granule_uid"])

print("\n",len(data)," results found")

VI0041_02C
VT0033_00C
VI0060_05C
VV0041_02C
VI0058_00C
VV0058_00C
VT0041_03C
VI0047_00C
VV0060_05C
VV0047_00C
VV0062_00C
VI0062_00C
VT0058_00C
VT0060_01C
VT0062_00C
VT0047_00C

 16  results found
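
To make the semantics of the range clauses used above concrete, here is a small local sketch that applies the same inclusive bounds to a single document. It does not call ElasticSearch; the function name `matches_range` and the sample values are assumptions made for illustration.

```python
# Hypothetical helper: mimics an inclusive range clause (gte / lte)
# on one document represented as a dictionary.
def matches_range(document, field, gte=None, lte=None):
    value = document.get(field)
    if value is None:
        # A document without the field does not match a range clause
        return False
    if gte is not None and value < gte:
        return False
    if lte is not None and value > lte:
        return False
    return True


# Made-up document with the fields used in the query above
sample = {"c2min": 12.5, "c2max": 47.0, "local_time_min": 18.3}

in_northern_hemisphere = (
    matches_range(sample, "c2min", gte=0, lte=90)
    and matches_range(sample, "c2max", gte=0, lte=90)
)
before_20h = matches_range(sample, "local_time_min", lte=20)

print(in_northern_hemisphere, before_20h)
```

Since both bounds are inclusive, a document with c2min equal to 0 or local_time_min equal to 20 still matches.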


### Group the MOCs by channel_id

Here we are using a dictionary: each key is a channel_id value, and the value associated with it is an array containing all the MOCs whose channel_id is equal to this key.
The following block of code fills this dictionary.

In [14]:
mocGroups = {"VIRTIS_M_VIS":[], "VIRTIS_M_IR":[], "VIRTIS_H":[]}
for element in data :
    mocGroups[element["channel_id"]].append(MOC.from_str(element["coverage"]))

For each channel_id we now use mocpy to compute the union of all the MOCs in the array corresponding to this channel_id.
We start from an empty MOC (mocUnion), replace it with its union with each element of the array in turn, and finally replace the array of MOCs with mocUnion at the corresponding key. Keys whose array is empty are removed from the dictionary.

So, the following block of code converts a dictionary of MOC arrays into a dictionary of MOCs, each covering the same area as the array of MOCs that was previously stored at the same key.

In [15]:
for key in list(mocGroups.keys()) :
    if(len(mocGroups[key]) > 0):
        mocUnion = MOC.new_empty(29)
        for moc in mocGroups[key] :
            mocUnion = mocUnion.union(moc)
        mocGroups[key] = mocUnion
    else :
        mocGroups.pop(key)
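
The group-then-union pattern used above can be tried out without mocpy by letting Python sets of cell indices stand in for MOCs. This is only an analogy under that assumption: the records and cell numbers below are made up, and a real MOC also carries a resolution order that a flat set does not.

```python
from collections import defaultdict
from functools import reduce

# Made-up records: "cells" is a set of integers standing in for a coverage
records = [
    {"channel_id": "VIRTIS_M_VIS", "cells": {1, 2, 3}},
    {"channel_id": "VIRTIS_M_VIS", "cells": {3, 4}},
    {"channel_id": "VIRTIS_M_IR", "cells": {2, 3, 5}},
    {"channel_id": "VIRTIS_H", "cells": {3, 6}},
]

# Step 1: group the coverages by channel_id
groups = defaultdict(list)
for record in records:
    groups[record["channel_id"]].append(record["cells"])

# Step 2: replace each list by the union of its members,
# starting from an empty set (the counterpart of an empty MOC)
unions = {
    channel: reduce(set.union, cell_sets, set())
    for channel, cell_sets in groups.items()
}

for channel, cells in unions.items():
    print(channel, sorted(cells))
```

Using a defaultdict means that only channels present in the data get a key, so no empty group has to be removed afterwards.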

In [16]:
aladin2 = Aladin(
    coo_frame='body',
    survey="http://voparis-srv-paris.obspm.fr/vo/planeto/hips/CDS_P_Venus_Magellan_C3-MDIR-2025m/"
)
aladin2

Aladin(coo_frame='body', options=['allow_full_zoomout', 'coo_frame', 'fov', 'full_screen', 'log', 'overlay_sur…

In [17]:
for group, moc in mocGroups.items() :
    jsonMoc = moc.serialize(format='json', optional_kw_dict=None, pre_v2=False)
    aladin2.add_moc_from_dict(jsonMoc, {'opacity' : 0.5, 'name' : group})

In [18]:
mocCenter = list(mocGroups.values())[0].barycenter()
aladin2.target = mocCenter.to_string()

### Intersection of MOCs

Here, we compute the intersection of the MOCs remaining in mocGroups (one per channel_id).

In [19]:
mocList = []
for moc in mocGroups.values() :
    mocList.append(moc)
    
if(len(mocGroups) > 0):
    mocIntersection = mocList[0]
    mocList.pop(0)
    for moc in mocList :
        mocIntersection = mocIntersection.intersection(moc)
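
The same folding can be written with functools.reduce. The sketch below relies only on an `.intersection` method, which Python sets provide and which the notebook also calls on mocpy MOC objects; the function name `intersect_all` and the sample sets are assumptions made for illustration.

```python
from functools import reduce


# Hypothetical helper: intersects all coverages of an iterable.
# Returns None when there is nothing to intersect, so that the caller
# can tell "no group at all" apart from "empty intersection".
def intersect_all(coverages):
    coverages = list(coverages)
    if not coverages:
        return None
    return reduce(lambda left, right: left.intersection(right), coverages)


# Sets of made-up cell indices standing in for the three group MOCs
unions = {
    "VIRTIS_M_VIS": {1, 2, 3, 4},
    "VIRTIS_M_IR": {2, 3, 5},
    "VIRTIS_H": {3, 6},
}

common = intersect_all(unions.values())
print(common)
print(intersect_all([]))
```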

We can plot this intersection in Aladin if needed:

In [20]:
jsonMoc = mocIntersection.serialize(format='json', optional_kw_dict=None, pre_v2=False)
aladin2.add_moc_from_dict(jsonMoc, {'opacity' : 0.5, 'name' : 'intersection'})

In [21]:
mocCenter = mocIntersection.barycenter()
aladin2.target = mocCenter.to_string()

Then, for each item in the data that we got in the "Get the data" section, we can compute the intersection between the MOC associated with this item and the MOC corresponding to the intersection of the three groups of MOCs (mocIntersection).

If this intersection is not empty, then this item overlaps mocIntersection.
So we get a list of items whose MOC intersects the MOCs of other items from the two other channel_id types.

In [22]:
results = []
for item in data :
    moc = MOC.from_str(item["coverage"])
    if not moc.intersection(mocIntersection).empty() :
        results.append(item)        

In [23]:
print(len(results), " results found:")
for item in results :
    print(item["granule_uid"])

15  results found:
VI0041_02C
VI0060_05C
VV0041_02C
VI0058_00C
VV0058_00C
VT0041_03C
VI0047_00C
VV0060_05C
VV0047_00C
VV0062_00C
VI0062_00C
VT0058_00C
VT0060_01C
VT0062_00C
VT0047_00C


## Select an area in Aladin and search elements within

### Import required modules

In [24]:
from astropy.coordinates import SkyCoord
import astropy.units as u
from ipywidgets import Layout, Box, widgets

### Draw a MOC with the mouse cursor

When the user clicks somewhere in Aladin (on the surface), the position of the click is saved and we update the ranges of c1 and c2 (longitude and latitude), which will be used in the query to avoid fetching too much data.

To get the MOC of the selected area, click on the button 'Add selection' and you will see it in Aladin (you might need to rotate the sphere to update the view).

Before doing so, we need to attach an event handler to the button and to the Aladin widget. This is done in the following blocks of code.

In [25]:
aladin3 = Aladin(
    layout=Layout(width="70%"),
    coo_frame='body',
    survey="http://voparis-srv-paris.obspm.fr/vo/planeto/hips/CDS_P_Venus_Magellan_C3-MDIR-2025m/"
)

button = widgets.Button(description="Add selection")

box_layout = Layout(
    display="flex", flex_flow="row", align_items="stretch", width="100%"
)
box = Box(children=[aladin3, button], layout=box_layout)
box

Box(children=(Aladin(coo_frame='body', layout=Layout(width='70%'), options=['allow_full_zoomout', 'coo_frame',…

None

In [39]:
def updateBounds(areaPoints, ra, dec) :
    if areaPoints["c1min"] is None or areaPoints["c1min"] > ra :
        areaPoints["c1min"] = ra
    if areaPoints["c1max"] is None or areaPoints["c1max"] < ra :
        areaPoints["c1max"] = ra
    if areaPoints["c2min"] is None or areaPoints["c2min"] > dec :
        areaPoints["c2min"] = dec
    if areaPoints["c2max"] is None or areaPoints["c2max"] < dec :
        areaPoints["c2max"] = dec
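
The bounding-box update can be checked on its own with a few made-up click positions. The sketch below restates the same logic under a hypothetical snake_case name (`update_bounds`) so that it runs without Aladin.

```python
# Hypothetical restatement of updateBounds: widens the stored
# longitude (c1) and latitude (c2) bounds to include a new point.
def update_bounds(bounds, lon, lat):
    if bounds["c1min"] is None or lon < bounds["c1min"]:
        bounds["c1min"] = lon
    if bounds["c1max"] is None or lon > bounds["c1max"]:
        bounds["c1max"] = lon
    if bounds["c2min"] is None or lat < bounds["c2min"]:
        bounds["c2min"] = lat
    if bounds["c2max"] is None or lat > bounds["c2max"]:
        bounds["c2max"] = lat


bounds = {"c1min": None, "c1max": None, "c2min": None, "c2max": None}

# Made-up click positions (longitude, latitude) in degrees
for lon, lat in [(10.0, 5.0), (30.0, -2.0), (20.0, 12.0)]:
    update_bounds(bounds, lon, lat)

print(bounds)
```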

In [40]:
def addMocSelectionToAladin(*_) :
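    # areaPoints is a dict, so its own length never changes: count the clicked points instead.
    # A polygon needs at least three vertices.
    if len(areaPoints["ra"]) >= 3 :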
        polygonCoordinates = SkyCoord(areaPoints["ra"]*u.deg, areaPoints["dec"]*u.deg)
        mocSelection = MOC.from_polygon_skycoord(polygonCoordinates, max_depth=15)
        aladin3.add_moc_from_dict(mocSelection.serialize("json"), {"color": "red", "opacity": 0.5, "name": "selection"})
        areaPoints["moc"] = mocSelection

In [41]:
areaPoints =  {"moc" : None, "c1min": None, "c1max" : None, "c2min" : None, "c2max" : None, "ra" : [], "dec" : []}

def addToAreaPoints(data) :
    updateBounds(areaPoints, data["ra"],data["dec"])
    areaPoints["ra"].append(data["ra"])
    areaPoints["dec"].append(data["dec"])

And we can now add the events:

In [29]:
aladin3.add_listener("click", addToAreaPoints)
button.on_click(addMocSelectionToAladin)

### Get the data

We get all the data that may intersect with the searched area, but we limit it to 2000 results here so that it is not too slow.

The two following functions create range queries for ElasticSearch.
Then, we can start searching.

In [36]:
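def getRange(field, lower, upper) :
    # Inclusive range clause: lower <= field <= upper
    # (parameters renamed so that they do not shadow the built-ins min and max)
    return {
        "range" : {
            field : {
                "gte" : lower,
                "lte" : upper
            }
        }
    }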

In [37]:
def getC1Range(c1min, c1max) :
    # List of range clauses on the longitude bounds (c1min and c1max fields)
    clauses = []
    c1Dist = abs(c1min - c1max)

    if c1Dist <= 180 :
        clauses.append(getRange("c1min",c1min,c1max))
        clauses.append(getRange("c1max",c1min,c1max))
    else :
        # Wide selections are treated as wrapping around the 0/360 meridian
        clauses.append(getRange("c1min",0,c1min))
        clauses.append(getRange("c1min",c1max,360))

        clauses.append(getRange("c1max",0,c1min))
        clauses.append(getRange("c1max",c1max,360))

    return clauses
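
To see what the two branches produce, the sketch below rebuilds both helpers under hypothetical snake_case names (`range_clause`, `longitude_clauses`) and prints the clauses for a narrow selection and for a wide one. The longitudes are made-up values.

```python
# Hypothetical restatement of getRange: inclusive range clause
def range_clause(field, lower, upper):
    return {"range": {field: {"gte": lower, "lte": upper}}}


# Hypothetical restatement of getC1Range
def longitude_clauses(c1min, c1max):
    if abs(c1min - c1max) <= 180:
        # Narrow selection: one interval [c1min, c1max] for each field
        return [
            range_clause("c1min", c1min, c1max),
            range_clause("c1max", c1min, c1max),
        ]
    # Wide selection: treated as wrapping around the 0/360 meridian,
    # so each field is searched in [0, c1min] and in [c1max, 360]
    return [
        range_clause("c1min", 0, c1min),
        range_clause("c1min", c1max, 360),
        range_clause("c1max", 0, c1min),
        range_clause("c1max", c1max, 360),
    ]


narrow = longitude_clauses(20, 60)
wrapped = longitude_clauses(10, 350)

for clause in narrow + wrapped:
    print(clause)
```

In the query these clauses go into a bool "should" list with minimum_should_match set to 1, so a document is kept as soon as one of them matches.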

In [38]:
query = {
    "bool": {
         "must": [
             {
                 "bool" : {
                     "should" : getC1Range(areaPoints["c1min"],areaPoints["c1max"]),
                     "minimum_should_match" : 1
                 }
             },
             {
                 "bool" : {
                     "should" : [
                         getRange("c2min", areaPoints["c2min"], areaPoints["c2max"]),
                         getRange("c2max", areaPoints["c2min"], areaPoints["c2max"])
                     ],
                     "minimum_should_match" : 1
                 }
             },
             {
                "exists" : {
                    "field" : "coverage"
                }
            },
             {
                "term": {
                    "service_title.keyword": {
                    "value": "vvex"
                    }
                }
            }
        ]
    }
}

page = es.search(
    index=indexName,
    query=query,
    size=2000,
    fields=["*"],
)

data = []
for document in page["hits"]["hits"] :
    data.append(document["_source"])

print("\n",len(data)," results found")


 2000  results found


### Search MOCs that intersect with it and get the associated elements

Then, for each item in the data that we got in the "Get the data" section just above, we compute the intersection between the MOC associated with this item and the MOC corresponding to the searched area.

If this intersection is not empty, then this item overlaps the searched area.

In [33]:
results = []
if areaPoints["moc"] is None :
    print("The search area is empty. Please select an area and click on the button")
else :  
    for item in data :
        if item["coverage"] is not None and item["coverage"] != "" :
            moc = MOC.from_str(item["coverage"])
            if not moc.intersection(areaPoints["moc"]).empty() :
                results.append(item)

In [34]:
print(len(results), " results found:")
for item in results :
    print(item["granule_uid"])

69  results found:
VI0901_02C
VV0042_00G
VV0992_08C
VI0145_15C
VI0261_06C
VI0460_07G
VV1873_04G
VV0992_02C
VV0040_00C
VT0261_04G
VV1509_08C
VT0277_01C
VT0371_02G
VV1437_01G
VV0145_15C
VT0657_02C
VV1429_02C
VT0742_03G
VT0741_04G
VT0901_02C
VT0896_03G
VT1267_02C
VT1267_02G
VT0457_02C
VT0457_02G
VT0377_03G
VT0256_03C
VV1440_02G
VV0459_07G
VV2337_02G
VV0249_12C
VV2073_06C
VV2720_12C
VV2716_14G
VV2714_12C
VV0664_06G
VV0733_03G
VI0145_15G
VV1140_09G
VV1140_04C
VI0214_01C
VI0249_12G
VI0458_07G
VI0459_07G
VI0901_02G
VV2180_04G
VV1439_01G
VT0218_01C
VT0287_03G
VT0278_01C
VT0742_04C
VT0900_02C
VT0377_03C
VT0040_00C
VV0256_09C
VV2481_08C
VV0459_07C
VV2355_04G
VV2643_00C
VV1991_05G
VV1990_04G
VV1910_04C
VV1910_04G
VV2180_04C
VI0664_06C
VI0742_02C
VV2129_01C
VV0901_02G
VV0992_08G


### Load the MOCs of some of the results in Aladin

In [35]:
for i in range(min(10, len(results))) :
    moc = MOC.from_str(results[i]["coverage"])
    aladin3.add_moc_from_dict(moc.serialize("json"), {"opacity": 0.5, "name": results[i]["granule_uid"]})

### Center the view on it

In [None]:
if len(results) > 0 :
    mocCenter = MOC.from_str(results[0]["coverage"]).barycenter()
    aladin3.target = mocCenter.to_string()