# Building and Handling Textual MOCs

This notebook accompanies the submitted paper **Encapsulating Textual Contents into a MOC data structure for Advanced Applications**.
It outlines the basic functionality of a new approach that integrates textual descriptions directly into the JSON representation of a MOC, enabling simultaneous semantic and spatial operations. After demonstrating some basic applications and the potential use in educational gamification, we later showcase applications in generative AI (GenAI).

The tutorials are organized in the following folders:
1. [tuto1_TextualMOC](https://github.com/ggreco77/TextualMOC/tree/main/tuto1_TextualMOC): basic application to build a Textual MOC
2. [AladinGame](https://github.com/ggreco77/TextualMOC/tree/main/AladinGame): using a Textual MOC for an educational game in Aladin Lite
3. [tuto2_SemanticMOC](https://github.com/ggreco77/TextualMOC/tree/main/tuto2_SemanticMOC): creating a Semantic MOC for applications in generative AI systems

#### Version 0.0.7 - September 2025

This notebook is divided into the following sections.

1. [**Basic Methods for Handling Textual MOCs**](#Basic-Methods-for-Handling-Textual-MOCs)  
    - [Creating a Textual MOC](#Creating-a-Textual-MOC)  
    - [Loading a Textual MOC](#Loading-a-Textual-MOC)  
    - [Creating Textual MOC from MocServer](#Creating-Textual-MOC-from-MocServer)  
    - [Gravitational-wave sky localization area and GCN Circular](#Gravitationa-wave-sky-localization-area-and-GCN-Circular)
    - [Annotating text within a specific MOC cell](#Annotating-text-within-a-specific-MOC-cell)
 

# Basic Methods for Handling Textual MOCs

 
Here are some basic applications of the **Textual MOC**, which enhances ordinary MOCs by encapsulating textual content. The `TextualMOC` class is designed to interact with a Multi-Order Coverage (MOC) object, enabling serialization, modification, and extension of MOC data with additional textual descriptions and images. The `__init__` method initializes the `TextualMOC` class with an optional MOC object. If a MOC object is provided, it is serialized into JSON format. Additionally, an `ipyaladin` widget is initialized for later use in visualizing the MOC.
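
To make the idea concrete, the sketch below starts from a spatial MOC in its standard JSON serialization (HEALPix order mapped to a list of cell indices) and attaches text next to the cells. The key names `text`, `media`, and `metadata`, as well as the cell values, are hypothetical placeholders chosen for illustration; the actual layout is whatever `TextualMOC.save` writes.

```python
import json

# A spatial MOC in JSON form: HEALPix order -> list of cell indices at that order.
# The cell values are arbitrary sample numbers.
spatial_moc = {"3": [451, 452], "4": [1820]}

# Hypothetical layout: keep the spatial cells and add textual fields beside them.
# These key names are assumptions, not the schema used by the TextualMOC class.
textual_moc_sketch = dict(spatial_moc)
textual_moc_sketch["text"] = "Sample description of this sky region."
textual_moc_sketch["media"] = "https://www.youtube.com/watch?v=903csY9kstk"
textual_moc_sketch["metadata"] = {"author": "GG"}

# Round-trip through JSON, as one would when saving and reloading a file.
serialized = json.dumps(textual_moc_sketch, indent=2)
restored = json.loads(serialized)

# Spatial keys are the ones that parse as integers; the rest carry the textual payload.
orders = sorted(int(key) for key in restored if key.isdigit())
extra_keys = sorted(key for key in restored if not key.isdigit())
print("HEALPix orders:", orders)
print("Non-spatial keys:", extra_keys)
```

Because the spatial part keeps the usual order-to-cells mapping, the text travels with the coverage in a single JSON document.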

For using methods that transform textual content into semantic embeddings, we recommend installing and running Ollama - https://ollama.com/.

**The complete list of methods is provided below**.

In [None]:
# Importing the necessary libraries

import json
import os
from datetime import datetime
import requests
#from copy import deepcopy

import numpy as np

from IPython.display import display
import ipywidgets as widgets

from ipyaladin import Aladin

from mocpy import MOC
import healpy as hp

import matplotlib.pyplot as plt

import astropy.units as u

from bs4 import BeautifulSoup

from langchain.embeddings import OllamaEmbeddings

from sklearn.metrics.pairwise import cosine_similarity
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

**While we wait for an official library for textual MOCs, we import some of the main methods required for the tutorial to work.**

In [None]:
import requests
from pathlib import Path

url = "https://raw.githubusercontent.com/ggreco77/TextualMOC/refs/heads/main/textualmoc/textual_moc.py"
dest = Path("textual_moc.py")  # Change the path/name if you want

if dest.exists():
    print(f"{dest} already exists; skipping download.")
else:
    with requests.get(url, stream=True, timeout=30) as r:
        r.raise_for_status()
        with open(dest, "wb") as f:
            for chunk in r.iter_content(chunk_size=8192):
                if chunk:
                    f.write(chunk)
    print("Saved to:", dest.resolve())

# Import the TextualMOC class from the downloaded module
from textual_moc import TextualMOC

In [None]:
# List of Methods in TextualMOC Class

import pandas as pd
from IPython.display import display, Markdown

methods = [
    {
        "Method": "add_text_media_image",
        "Description": "Adds text, media and image to `TextualMOC` by reading from a file or a URL."
    },
    {
        "Method": "annotate_cell",
        "Description": "Assigns a textual annotation to a specific MOC cell within the JSON data structure."
    },
    {
        "Method": "embedding_from_custom_text",
        "Description": "Generates an embedding of the text using a specified service and model."
    },
    {
        "Method": "load_textual_moc",
        "Description": "Loads an instance of `TextualMOC` from a JSON file."
    },
    {
        "Method": "plot_moc_area",
        "Description": "Visualizes the MOC area using matplotlib."
    },
    {
        "Method": "show_image_value",
        "Description": "Prints the image URL stored in the MOC data as a clickable link."
    },
    {
        "Method": "show_media_value",
        "Description": "Prints the multimedia URL stored in the MOC data."
    },
    {
        "Method": "show_metadata_value",
        "Description": "Prints metadata information such as author, date, and last text update."
    },
    {
        "Method": "show_text_value",
        "Description": "Prints the custom text stored in the MOC data."
    },
    {
        "Method": "render",
        "Description": "Loads the MOC from a JSON file and displays text, media, MOC area, metadata, image and embedding if present."
    },
    {
        "Method": "render_ipyaladin",
        "Description": "Displays the MOC in an Aladin viewer with defined colors, transparency, and HiPS."
    },
    {
        "Method": "save",
        "Description": "Saves the current state of `TextualMOC` in JSON format."
    },
    {
        "Method": "update_metadata",
        "Description": "Updates metadata such as author and date in the MOC's JSON data."
    },
    {
        "Method": "update_text_inline",
        "Description": "Appends new text to the custom text stored in the MOC's JSON data."
    },
#    {
#        "Method": "union",
#        "Description": "Merges the current MOC instance with another instance of `TextualMOC`."
#    }
]

# Create the DataFrame
df_methods = pd.DataFrame(methods)

# Sort the DataFrame alphabetically by the method name
df_methods = df_methods.sort_values(by="Method").reset_index(drop=True)

# Adjust the index to start at 1 instead of 0
df_methods.index = df_methods.index + 1
df_methods.index.name = 'No.'

# Prevent pandas from truncating the descriptions
pd.set_option('display.max_colwidth', None)

# Define the title
title = "# List of Methods in `TextualMOC`"

# Display the title and the table
display(Markdown(title))
display(df_methods)


## Creating a Textual MOC

The example demonstrates how to create and interact with a "Textual Multi-Order Coverage" (`Textual MOC`) object using the `mocpy` library. It begins by defining a MOC from a circle, using an [STC-S string](https://ivoa.net/documents/STC-S/20130917/WD-STC-S-1.0-20130917.html). This MOC is then converted into a `TextualMOC`, a specialized type of MOC that can contain both text and multimedia content.

Next, the `TextualMOC` is enriched by adding text from a local file and a multimedia link (a YouTube video). The metadata is updated to include the author's name ("GG") and additional custom text describing the approach used. This enhanced `TextualMOC` is saved as a JSON file for easy storage or sharing.

Finally, the code loads and visualizes the `TextualMOC` using `matplotlib` or `ipyaladin`, displaying the textual content, the sky area, multimedia links, and metadata in a single interactive view.

This example shows how MOCs can be leveraged for advanced applications, integrating both spatial and textual content within a data visualization and analysis framework.
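
As a quick sanity check on the numbers used in this example, the sketch below estimates the area of a 10° circle on the sky and the size of a single HEALPix cell at order 10 (the `max_depth` passed to `MOC.from_stcs`). It uses only the spherical-cap formula and the HEALPix cell count; the MOC itself approximates the circle with cells, so its exact coverage will differ slightly.

```python
import math

radius_deg = 10.0   # circle radius from the STC-S string
max_depth = 10      # deepest HEALPix order used to build the MOC

# Area of the whole sky in square degrees (about 41253 deg^2).
full_sky_deg2 = 4 * math.pi * (180 / math.pi) ** 2

# Solid angle of a spherical cap is 2*pi*(1 - cos r); divide by 4*pi for the sky fraction.
sky_fraction = (1 - math.cos(math.radians(radius_deg))) / 2
cap_area_deg2 = sky_fraction * full_sky_deg2

# HEALPix divides the sphere into 12 * 4**order equal-area cells.
n_cells = 12 * 4 ** max_depth
cell_area_arcmin2 = full_sky_deg2 / n_cells * 3600

print(f"cap area: {cap_area_deg2:.1f} deg^2 ({sky_fraction:.4%} of the sky)")
print(f"order-{max_depth} cell area: {cell_area_arcmin2:.2f} arcmin^2")
```

A deeper `max_depth` follows the circle boundary more closely at the cost of more cells in the JSON file.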

### Additional Notes

> **Ensure:** The local text file (`textual_content_example.txt`) is available in the same directory or provide the correct path. Otherwise, you can run the cell below to create this text file.

In [None]:
# Creating the text file: textual_content_example.txt

def write_text_file():
    """
    Creates a text file and writes sample content into it. The file 'textual_content_example.txt' 
    is created and saved in the current working directory.
    """

    content = (
        "This is a sample text added to a spatial MOC in JSON format. "
        "The text demonstrates how additional information can be embedded into a Multi-Order Coverage (MOC) object, "
        "enhancing its descriptive capabilities by including custom textual content alongside the spatial data."
    )
    
    with open("textual_content_example.txt", "w", encoding="utf-8") as file:
        file.write(content)

# Execute the function to create the file
write_text_file()

In [None]:
# Example usage 1: Creating a Textual MOC from a MOC object and interacting with it

# Create a MOC from a STC-S string.
moc = MOC.from_stcs("Circle ICRS 269.4042 -29.0078 10", max_depth=10)

# Initialize TextualMOC class with the created MOC
textual_moc = TextualMOC(moc)

# Path to the local text file
text_file_path = 'textual_content_example.txt'

# Setting Textual MOC name
textual_moc_example = 'textual_moc_example.json'

# Adding Multimedia link
multimedia_url = 'https://www.youtube.com/watch?v=903csY9kstk'

# Adding text and multimedia into the MOC
textual_moc.add_text_media_image(text_file_path, multimedia_url)

# Updating metadata with author and date
textual_moc.update_metadata(author="GG")

# Appending custom text inline
textual_moc.update_text_inline("This approach is described in the paper 'Encapsulating Textual Content into "
                                 "MOC data structure for Advanced Applications'.")

# Saving in an external file the Textual MOC (with encapsulated text, multimedia, and metadata) in JSON format
textual_moc.save(textual_moc_example)

# Loading and displaying the Textual MOC.
textual_moc.render(textual_moc_example, show_text=True, show_area=True, show_multimedia=True, show_metadata=True, show_image=True)

In [None]:
# Display the MOC in the Aladin viewer with widgets

# Run the cell twice to show the MOC
textual_moc.render_ipyaladin(fov=180)

## Loading a Textual MOC

This code demonstrates how to load and interact with an existing "Textual Multi-Order Coverage" (Textual MOC) object from an external file. It starts by initializing a new `TextualMOC` object without any initial MOC data. The `TextualMOC` is then loaded from a pre-existing JSON file (`textual_moc_example.json`), which contains the textual content, multimedia links, metadata, and information about the sky area.

After loading the `TextualMOC`, several methods are available to interact with its contents. The `show_text_value` method outputs the encapsulated text, `show_media_value` displays any associated multimedia links, `show_metadata_value` provides the metadata details, and `show_image_value` prints the image URL. Furthermore, the `plot_moc_area` method uses `matplotlib` to visualize the sky area defined by the MOC.

This example illustrates how to effectively load and explore a `TextualMOC` object, providing insights into its content, multimedia elements, and metadata while simultaneously visualizing its spatial coverage.

### Additional Notes

> **Ensure:** The file `textual_moc_example.json` exists in the same directory, or provide the correct path to load the Textual MOC successfully. This file is created by the example in the previous section.


In [None]:
# Example usage 2: Loading an existing textual MOC from an external file and interacting with it.

# Initialize a new TextualMOC without an initial MOC object
textual_moc_loaded = TextualMOC()

# Load the entire textual MOC from an existing file
textual_moc_loaded.load_textual_moc('textual_moc_example.json')

# Use the methods... 
textual_moc_loaded.show_text_value()
textual_moc_loaded.show_media_value()
textual_moc_loaded.show_metadata_value()
textual_moc_loaded.show_image_value()
textual_moc_loaded.plot_moc_area()

# ... or in one shot
textual_moc_loaded.render('textual_moc_example.json', show_text=True, show_area=True, show_multimedia=True, show_metadata=True)

In [None]:
# Display the MOC in the Aladin viewer with widgets

# Run the cell twice to show the MOC
textual_moc_loaded.render_ipyaladin(fov=180)

## Creating Textual MOC from MocServer

In this exercise, you'll use the [CDS MOCServer](https://alasky.cds.unistra.fr/MocServer/query), a dedicated web service that provides access to Multi-Order Coverage (MOC) maps representing sky regions observed by various datasets. You'll benefit from high-quality resources and references, enabling you to enrich these MOCs with custom text and multimedia, enhancing their value and informativeness for astronomical research and analysis.

The CDS MOCServer is a directory service for astronomical resources that quickly provides a list of relevant resources (in just a few milliseconds) based on spatial and/or temporal coverage of the sky. Developed by the Strasbourg Astronomical Data Centre (CDS) in 2015, it integrates with several CDS tools, including [Aladin Desktop](https://aladin.cds.unistra.fr/AladinDesktop/), [Aladin Lite](https://aladin.cds.unistra.fr/AladinLite/), [Simbad](https://simbad.cds.unistra.fr/simbad/), and the [CDS portal](http://cdsportal.u-strasbg.fr/). As of 2023, the CDS MOCServer hosts 32,000 entries and is remotely accessible via an HTTP API. For Python users, the `astroquery` package provides access through the [astroquery.mocserver](https://astroquery.readthedocs.io/en/latest/mocserver/mocserver.html) module.

The example uses data from [the Euclid Q1 data release](https://esdcdoi.esac.esa.int/doi/html/data/astronomy/euclid/eqrq1.html).


In [None]:
import tempfile
from astroquery.mocserver import MOCServer

# Retrieving the MOC of a specific dataset: CDS/P/Euclid/Q1/color
moc_from_mocserver = MOCServer.find_datasets(meta_data="ID=CDS/P/Euclid/Q1/color", return_moc=True)

# Inspecting metadata - obs_description
metadata = MOCServer.find_datasets(meta_data="ID=CDS/P/Euclid/Q1/color")
obs_description = metadata["obs_description"][0]

#obs_description = obs_description.replace("", "≈65 deg²")

# Initialize TextualMOC class with the created MOC
textual_moc = TextualMOC(moc_from_mocserver)

# Multimedia link
multimedia_url = 'https://esdcdoi.esac.esa.int/doi/html/data/astronomy/euclid/eqrq1.html'

# Create a temporary file and write the obs_description to it
with tempfile.NamedTemporaryFile(delete=False, mode='w', suffix='.txt') as temp_file:
    temp_file.write(obs_description)
    temp_file_name = temp_file.name  # Get the temporary file name

# Adding text and multimedia using the temporary file name
textual_moc.add_text_media_image(temp_file_name, multimedia_url)

In [None]:
textual_moc.show_text_value()

In [None]:
# Display the MOC in the Aladin viewer with widgets

# Run the cell twice to show the MOC
textual_moc.render_ipyaladin(survey="CDS/P/Euclid/Q1/color", fov=180)

## Gravitationa-wave sky localization area and GCN Circular

To process a GCN Circular from the LIGO, Virgo, and KAGRA collaboration, we start by retrieving the GCN Circular, which contains details about a gravitational wave event. Next, we fetch the skymap associated with this circular. The skymap provides a probability distribution of the source's location on the sky.

We then compute the 90% credible region from the skymap, representing the most likely area where the source might be found. This credible region is converted into a Multi-Order Coverage (MOC) map format, which allows for efficient spatial indexing and manipulation; see [Multi Order Coverage data structure to plan multi-messenger observations](https://www.sciencedirect.com/science/article/pii/S2213133722000026).
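
The selection of a credible region can be sketched in a few lines: rank the cells by probability density and accumulate probability until the requested level is reached. The code below is a simplified illustration on a made-up multi-order map, not the implementation used by `mocpy`; the function name `credible_region` and the toy values are assumptions introduced here.

```python
import math

def credible_region(cells, level=0.9):
    """Greedily pick the highest-density cells until their summed probability reaches `level`.

    `cells` is a list of (order, ipix, probability) tuples whose probabilities sum to 1.
    """
    def density(cell):
        order, _, prob = cell
        area_sr = 4 * math.pi / (12 * 4 ** order)  # HEALPix cell area in steradians
        return prob / area_sr

    selected, cumulative = [], 0.0
    for order, ipix, prob in sorted(cells, key=density, reverse=True):
        if cumulative >= level:
            break
        selected.append((order, ipix))
        cumulative += prob
    return selected, cumulative

# Toy multi-order map with invented numbers (not taken from any real event).
toy_map = [
    (2, 10, 0.21),  # coarse cell: sizeable probability spread over a large area
    (3, 41, 0.50),
    (3, 42, 0.25),
    (3, 43, 0.04),
]

region, enclosed = credible_region(toy_map, level=0.9)
print("selected cells:", region)
print("enclosed probability:", round(enclosed, 2))
```

Ranking by density rather than by raw probability matters for multi-order maps, because a coarse cell can hold more probability than a fine one while being far less concentrated.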

To enrich the MOC, we include the original GCN Circular's content and link it to the associated online resources. Finally, we display the enriched Textual MOC, which now contains both the computed 90% credible region and relevant contextual information from the GCN Circular. This provides a comprehensive view, combining probabilistic spatial data with the supporting documentation and resources.

As an example, we will use [GCN Circular number 37447](https://gcn.nasa.gov/circulars/37447?query=S240910ci).

### Additional Notes
> The text of the GCN Circular is retrieved directly from the GCN site, which serves each circular in both **text** and **JSON** formats.

### Expanding Textual MOCs to Neutrino and GRB Event Localization
> The extensive use of skymaps in the MOC-HEALPix format enables the use of Textual MOCs also in the context of neutrino localization and GRB events.

In [None]:
# Downloading sky localization from GraceDB
!curl -O https://gracedb.ligo.org/api/superevents/S240910ci/files/Bilby.multiorder.fits

# Create the 90% credible region as a MOC
moc_gw = MOC.from_multiordermap_fits_file("Bilby.multiorder.fits", cumul_from=0.0, cumul_to=0.9)

# Initialize TextualMOC class with the created MOC
textual_moc_gw = TextualMOC(moc_gw)

# Multimedia link
multimedia_url = 'https://gcn.nasa.gov/circulars/37447?query=S240910ci'

# Adding text and multimedia into the MOC
textual_moc_gw.add_text_media_image("https://gcn.nasa.gov/circulars/37447.txt", multimedia_url)

In [None]:
# Save in an external file the Textual MOC
textual_moc_gw.save("textual_moc_gw.json")

# Load and display the Textual MOC.
textual_moc_gw.render("textual_moc_gw.json", show_text=True, show_area=True, show_multimedia=True, show_metadata=True, show_image=True)

In [None]:
# Display the MOC in the Aladin viewer with widgets

# Run the cell twice to show the MOC
textual_moc_gw.render_ipyaladin(survey="P/Mellinger/color", fov=180)

## Annotating text within a specific MOC cell

Up to this point, we have attached text to the entire MOC coverage; here we introduce functions that create labels in specific MOC cells chosen by the user.
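
A single MOC cell is identified by its HEALPix order and its index at that order. The sketch below shows how such a pair maps to the NUNIQ number used by the MOC standard and how large the cell is. It assumes that the first two arguments of `annotate_cell` are `(order, ipix)`, which the call in the next cell suggests but the text does not state; `cell_info` and `from_uniq` are hypothetical helper names.

```python
import math

def cell_info(order, ipix):
    """Return basic facts about the HEALPix cell identified by (order, ipix)."""
    n_cells = 12 * 4 ** order
    if not 0 <= ipix < n_cells:
        raise ValueError(f"ipix must be in [0, {n_cells}) at order {order}")
    full_sky_deg2 = 4 * math.pi * (180 / math.pi) ** 2
    return {
        "uniq": 4 * 4 ** order + ipix,         # NUNIQ packing: 4 * 4**order + ipix
        "n_cells": n_cells,                    # number of cells on the sky at this order
        "area_deg2": full_sky_deg2 / n_cells,  # every cell at a given order has the same area
    }

def from_uniq(uniq):
    """Invert the NUNIQ packing back to (order, ipix)."""
    order = (uniq.bit_length() - 3) // 2
    return order, uniq - 4 * 4 ** order

info = cell_info(3, 451)
print(info)
print(from_uniq(info["uniq"]))
```

The range check is useful before annotating: an index that is valid at one order may not exist at a coarser one.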

In [None]:
# Loading an existing textual MOC from an external file AND Annotating text within a specific MOC cell

# Initialize a new TextualMOC without an initial MOC object
textual_moc_with_cell_note = TextualMOC()

# Load a textual MOC from an existing file
textual_moc_with_cell_note.load_textual_moc('textual_moc_example.json')

# Annotating text within a specific MOC cell
textual_moc_with_cell_note.annotate_cell(3, 451, "cell annotation!")

# Plot the MOC with the cell annotation
textual_moc_with_cell_note.plot_moc_area()

# Save the textual MOC with the labelled cell
textual_moc_with_cell_note.save("annotated_moc.json")