# SAR Dataset Accessibility

### Work in progress notebook

There are several ways to find and access the SAR datasets, which contain the calibrated geophysical range Doppler frequency shift retrievals from the ENVISAT ASAR wide-swath acquisitions obtained between 2002 and 2012. This notebook describes a selection of methods for finding and extracting these datasets.

### Find Data Through Web Search


All data is freely available and can be found in the MET Norway thredds catalog: https://thredds.met.no/thredds/catalog.html

![Thredds Dataset Overview](../images/Thredds_Dataset_Overview_image_cropped.png)

The ENVISAT ASAR datasets are located at: https://thredds.met.no/thredds/catalog/remotesensingenvisat/asar-doppler/catalog.html

Alternatively, navigate from the top-level catalog by following this folder structure: Observations/Remotesensing_archive/ENVISAT_ASAR_Doppler:

![ENVISAT ASAR Doppler Overview](../images/ENVISAT_ASAR_Doppler_Overview_cropped.png)

Inside this folder, the netCDF files are organised in subfolders by date. To reach the files for a specific date, follow the structure YEAR/MONTH/DAY.

The path to 2012/01/27 is shown below:

![ASAR 2012 overview](../images/ASAR_2012_overview_cropped.png)

![ASAR 2012/01 overview](../images/ASAR_2012_01_overview_cropped.png)

![ASAR 2012/01/27 overview](../images/ASAR_2012_01_27_overview_cropped.png)

The entire list of files for the specified date is then accessible (the screenshot shows only the top of the list).
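
The YEAR/MONTH/DAY layout makes it possible to construct the catalog URL for a given date directly. The following is a minimal sketch, assuming the catalog base URL given above and zero-padded month and day folders (as in the 2012/01/27 example). The helper name `catalog_url_for_date` is hypothetical and not part of any library.

```python
from datetime import date

# Base of the ENVISAT ASAR Doppler catalog, taken from the link above
CATALOG_BASE = "https://thredds.met.no/thredds/catalog/remotesensingenvisat/asar-doppler"


def catalog_url_for_date(day: date) -> str:
    """Return the THREDDS catalog page that lists the files for one day.

    Assumes zero-padded MONTH and DAY folder names (e.g. 2012/01/27).
    """
    return f"{CATALOG_BASE}/{day:%Y/%m/%d}/catalog.html"


print(catalog_url_for_date(date(2012, 1, 27)))
```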

### Access Data

Upon accessing a specific netCDF file, four different "Access" options are available: "OPENDAP", "HTTPServer", "WCS" and "WMS".

![ASAR 2012-01-27 netCDF overview.png](../images/ASAR_2012_01_27_netCDF_overview.png)

The separate options are explained in the following. The examples use the netCDF file "ASA_WSDV2PRNMI20120127_215005_000614583111_00101_51839_0000.nc" (the uppermost file under 2012/01/27).
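
On a THREDDS server, each access option usually corresponds to a service prefix in the URL. The sketch below derives the four URLs from a dataset path. Only the `dodsC` prefix is confirmed by the OPENDAP URL used in this notebook; `fileServer`, `wcs` and `wms` are assumed to follow the common THREDDS defaults and should be checked against the links on the dataset page. The helper name `access_urls` is hypothetical.

```python
THREDDS_ROOT = "https://thredds.met.no/thredds"

# Service prefixes: only "dodsC" appears in this notebook, the others are
# assumed THREDDS defaults and may differ on this server.
SERVICE_PREFIXES = {
    "OPENDAP": "dodsC",
    "HTTPServer": "fileServer",
    "WCS": "wcs",
    "WMS": "wms",
}


def access_urls(dataset_path: str) -> dict[str, str]:
    """Map each access option to its URL for a dataset path below the root."""
    return {
        name: f"{THREDDS_ROOT}/{prefix}/{dataset_path}"
        for name, prefix in SERVICE_PREFIXES.items()
    }


# Dataset path taken from the OPENDAP URL used in the xarray example
path = (
    "remotesensingenvisat/asar-doppler/2012/01/27/"
    "ASA_WSDV2PRNMI20120127_215005_000612433111_00101_51839_0000.nc"
)
for name, url in access_urls(path).items():
    print(f"{name}: {url}")
```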

#### OPENDAP - Using xarray:

The data is easily accessed through OPENDAP with the xarray Python package. Below is an example of how to use xarray to open and investigate a desired dataset.
This procedure makes it easy to inspect the Dimensions, Coordinates, Data Variables, Indexes and Attributes of the dataset in question.

In [1]:
# Import the required package: xarray
import xarray as xr

# Providing the OPENDAP-url
OPENDAP_url = 'https://thredds.met.no/thredds/dodsC/remotesensingenvisat/asar-doppler/2012/01/27/ASA_WSDV2PRNMI20120127_215005_000612433111_00101_51839_0000.nc'

# Using xarray to open the dataset using the OPENDAP-url
ds = xr.open_dataset(OPENDAP_url)

# Investigating the data as an xarray.Dataset 
ds

### Find Data Through CSW (Catalog Service on the Web)

Finding the datasets using CSW is currently problematic. The datasets used as examples in https://github.com/metno/esa-coscaw-data-search can be found, but not the Envisat ASAR data needed for this manual, which are presumably available only at the staging site for now. Note to self: see csw_test.ipynb for a code example.

###### Need to change the endpoint when the data is made available

In [16]:
from fadg.find_and_collocate import SearchCSW
from datetime import datetime, timedelta

############################################# Time and dt ########################################################

time_str = '2012-02-15 00:00:00' # Valid datetime string for the SearchCSW function
                                 # Default is the time right now; now = datetime.now()

time = datetime.strptime(time_str, '%Y-%m-%d %H:%M:%S')

dt = 24        # dt : float (default 24) - Total time interval in hours, centered on the selected time (dt/2 before and dt/2 after)
# dt = 24*20
print(f'Finding data within the timespan of: {time - timedelta(hours=dt/2)} and {time + timedelta(hours=dt/2)}.')
print('\n')


################################################## Text ##################################################################

text = "Doppler" # This text string needs to be a part of the filename of the files to be found.

print(f'Finding data with file names containing "{text}".')
print('\n')

################################################## bbox ########################################################################

boundary_box = [34.9, 80.9, 35.1, 81]   # The geographical extent of the desired datasets only has to intersect this bounding box.
                                        # Default : [-180, -90, 180, 90]

print(f'Finding data intersected by the specified bounding box: {boundary_box}.')
print('\n')

################################################ endpoint #####################################################################

endpoint = "https://csw.s-enda-staging.k8s.met.no"   # The site where the data is located

print(f"Searching for data with endpoint set to: {endpoint}.")
print('\n')

############################################ Finding the Corresponding datasets ########################################################
# Find all SAR Doppler data within dt/2 hours before and after the given time
# (the time defaults to now and dt to 24 if they are not specified):

# sar = SearchCSW(time = time, dt=dt, text="Doppler", endpoint="https://csw.s-enda-staging.k8s.met.no") # Worked but got sarwind data (as expected)
sar = SearchCSW(time = time, dt = dt, text = text, bbox = boundary_box, endpoint = endpoint) # Same search with bbox added (found the same number of files as above)

### How many files were found
n_files = len(sar.urls)
if n_files == 0:
    print('No data match the chosen criteria...')
elif n_files == 1:
    print(f'There is {n_files} file that matches the chosen criteria!')
else:
    print(f'There are {n_files} files that match the chosen criteria!')

print('\n')

### Provide the found URLs
sar.urls.sort()  # list.sort() sorts the list in place and returns None
print('These are the OPeNDAP URLs of the datasets that match the chosen criteria:')
sar.urls

Finding data within the timespan of: 2012-02-14 12:00:00 and 2012-02-15 12:00:00.


Finding data with file names containing "Doppler".


Finding data intersected by the specified bounding box: [34.9, 80.9, 35.1, 81].


Searching for data with endpoint set to: https://csw.s-enda-staging.k8s.met.no.


There are 4 files that match the chosen criteria!


These are the OPeNDAP URLs of the datasets that match the chosen criteria:


['https://thredds.met.no/thredds/dodsC/remotesensingenvisat/asar-doppler/2012/02/14/ASA_WSDH2PRNMI20120214_103237_000601593111_00353_52091_0000.nc',
 'https://thredds.met.no/thredds/dodsC/remotesensingenvisat/asar-doppler/2012/02/14/ASA_WSDH2PRNMI20120214_170908_000623603111_00357_52095_0000.nc',
 'https://thredds.met.no/thredds/dodsC/remotesensingenvisat/asar-doppler/2012/02/14/ASA_WSDH2PRNMI20120214_171008_000624093111_00357_52095_0000.nc',
 'https://thredds.met.no/thredds/dodsC/remotesensingenvisat/asar-doppler/2012/02/15/ASA_WSDH2PRNMI20120215_095617_000599363111_00367_52105_0000.nc']
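
The file names in the returned URLs carry a `YYYYMMDD_HHMMSS` block, which can be used to order or filter the results by time without opening the files. The sketch below assumes that this block is the acquisition start time, as in the usual ENVISAT product naming; the helper name `start_time_from_url` is hypothetical.

```python
import re
from datetime import datetime


def start_time_from_url(url: str) -> datetime:
    """Extract the YYYYMMDD_HHMMSS block from the file name of a URL.

    Assumes the block is the acquisition start time (ENVISAT naming).
    """
    name = url.rsplit("/", 1)[-1]
    match = re.search(r"(\d{8}_\d{6})_", name)
    if match is None:
        raise ValueError(f"No timestamp found in {name!r}")
    return datetime.strptime(match.group(1), "%Y%m%d_%H%M%S")


url = (
    "https://thredds.met.no/thredds/dodsC/remotesensingenvisat/asar-doppler/"
    "2012/02/14/ASA_WSDH2PRNMI20120214_103237_000601593111_00353_52091_0000.nc"
)
print(start_time_from_url(url))
```

With the list from the search above, `sorted(sar.urls, key=start_time_from_url)` would order the files chronologically under the same assumption.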

### Get Parent Datasets and their Children (or Dataset Series in ISO 19115) with OGC CSW

MET Norway organises datasets in parent-child relationships. A parent can be a given model simulation like [Arome-Arctic deterministic](https://data.csw.met.no/?mode=opensearch&service=CSW&version=2.0.2&request=GetRecords&elementsetname=full&typenames=csw:Record&resulttype=results&q=deterministic), where the link provides the OGC CSW result of a search for "deterministic".

The same search but with results provided in ISO format: https://data.csw.met.no/csw?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetRecords&RESULTTYPE=results&TYPENAMES=csw:Record&ElementSetName=full&q=deterministic&outputschema=http://www.isotc211.org/2005/gmd.

Here, the field gmd:parentIdentifier provides the metadata identifier of the parent dataset, in this case no.met:806070da-e9f3-4d03-ba1d-26b843961634.

Get the parent dataset:

     https://data.csw.met.no/csw?service=CSW&version=2.0.2&request=GetRepositoryItem&id=no.met:806070da-e9f3-4d03-ba1d-26b843961634

Get all its children:

     https://data.csw.met.no/csw?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetRecords&RESULTTYPE=results&TYPENAMES=csw:Record&ElementSetName=full&outputFormat=application%2Fxml&outputschema=http://www.isotc211.org/2005/gmd&CONSTRAINTLANGUAGE=CQL_TEXT&CONSTRAINT=apiso:ParentIdentifier%20like%20%27no.met:806070da-e9f3-4d03-ba1d-26b843961634%27

To find all parent datasets:

     https://csw.s-enda-staging.k8s.met.no/csw?SERVICE=CSW&VERSION=2.0.2&REQUEST=GetRecords&RESULTTYPE=results&TYPENAMES=csw:Record&ElementSetName=full&outputschema=http://www.isotc211.org/2005/gmd&CONSTRAINTLANGUAGE=CQL_TEXT&CONSTRAINT=dc:type%20like%20%27series%27
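
Writing the percent-encoded CQL constraint by hand is error-prone, so the children query can also be assembled programmatically. This is a minimal sketch using only the Python standard library. It mirrors the parameters of the children query above, except that `outputFormat` is left out on the assumption that the server defaults to XML. The helper name `children_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

CSW_ENDPOINT = "https://data.csw.met.no/csw"


def children_url(parent_id: str) -> str:
    """Build a GetRecords URL that asks for all children of a parent dataset."""
    params = {
        "SERVICE": "CSW",
        "VERSION": "2.0.2",
        "REQUEST": "GetRecords",
        "RESULTTYPE": "results",
        "TYPENAMES": "csw:Record",
        "ElementSetName": "full",
        "outputschema": "http://www.isotc211.org/2005/gmd",
        "CONSTRAINTLANGUAGE": "CQL_TEXT",
        "CONSTRAINT": f"apiso:ParentIdentifier like '{parent_id}'",
    }
    # quote() encodes spaces as %20 and quotes as %27; ":" and "/" are kept
    return f"{CSW_ENDPOINT}?{urlencode(params, quote_via=quote, safe=':/')}"


print(children_url("no.met:806070da-e9f3-4d03-ba1d-26b843961634"))
```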


### Find Data with OpenSearch

OpenSearch is a way for websites and search engines to publish search results in a standard and accessible format.

To find all datasets in the catalogue:

    https://data.csw.met.no/?mode=opensearch&service=CSW&version=2.0.2&request=GetRecords&elementsetname=full&typenames=csw:Record&resulttype=results

Or datasets within a given time span:

    http://data.csw.met.no/?mode=opensearch&service=CSW&version=2.0.2&request=GetRecords&elementsetname=full&typenames=csw:Record&resulttype=results&time=2000-01-01/2020-09-01

Or datasets within a geographical domain (defined as a box with parameters min_longitude, min_latitude, max_longitude, max_latitude):

    https://data.csw.met.no/?mode=opensearch&service=CSW&version=2.0.2&request=GetRecords&elementsetname=full&typenames=csw:Record&resulttype=results&bbox=0,40,10,60

Or, datasets with "arome-arctic 2.5Km deterministic" in the title:

    https://data.csw.met.no/?mode=opensearch&service=CSW&version=2.0.2&request=GetRecords&elementsetname=full&typenames=csw:Record&resulttype=results&q=arome-arctic%202.5Km%20deterministic
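
The OpenSearch queries above differ only in their last parameter, so they can be generated from a single helper. The sketch below uses only the standard library and the parameter names `q`, `bbox` and `time` exactly as they appear in the URLs above. The function name `opensearch_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

OPENSEARCH_BASE = "https://data.csw.met.no/"

# Parameters shared by all the OpenSearch queries shown above
FIXED_PARAMS = {
    "mode": "opensearch",
    "service": "CSW",
    "version": "2.0.2",
    "request": "GetRecords",
    "elementsetname": "full",
    "typenames": "csw:Record",
    "resulttype": "results",
}


def opensearch_url(q=None, bbox=None, time=None) -> str:
    """Build an OpenSearch URL.

    q    : free-text search string
    bbox : (min_longitude, min_latitude, max_longitude, max_latitude)
    time : (start, end) as date strings, e.g. ("2000-01-01", "2020-09-01")
    """
    params = dict(FIXED_PARAMS)
    if q is not None:
        params["q"] = q
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    if time is not None:
        params["time"] = "/".join(time)
    # Spaces become %20; ":", "/" and "," are left readable
    return f"{OPENSEARCH_BASE}?{urlencode(params, quote_via=quote, safe=':/,')}"


print(opensearch_url(bbox=(0, 40, 10, 60)))
print(opensearch_url(q="arome-arctic 2.5Km deterministic"))
```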



### More Advanced Geographical Search with OGC CSW

The pycsw OpenSearch interface only supports geographical searches by bounding box. More advanced geographical searches require a request written as an XML file. For example:

* To find all datasets containing a point (my_xml_request_containing_a_point.xml):

```xml
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<csw:GetRecords
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:ogc="http://www.opengis.net/ogc"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    service="CSW"
    version="2.0.2"
    resultType="results"
    maxRecords="10"
    outputFormat="application/xml"
    outputSchema="http://www.opengis.net/cat/csw/2.0.2"
    xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" >
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter>
        <ogc:Contains>
          <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
          <gml:Point>
            <gml:pos srsDimension="2">59.0 4.0</gml:pos>
          </gml:Point>
        </ogc:Contains>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>
```

* To find all datasets intersecting a polygon (my_xml_request_intersecting_a_polygon.xml):

```xml
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<csw:GetRecords
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:ogc="http://www.opengis.net/ogc"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    service="CSW"
    version="2.0.2"
    resultType="results"
    maxRecords="10"
    outputFormat="application/xml"
    outputSchema="http://www.opengis.net/cat/csw/2.0.2"
    xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" >
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>full</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter>
        <ogc:Intersects>
          <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
          <gml:Polygon>
            <gml:exterior>
              <gml:LinearRing>
                <gml:posList>
                  47.00 -5.00 55.00 -5.00 55.00 20.00 47.00 20.00 47.00 -5.00
                </gml:posList>
              </gml:LinearRing>
            </gml:exterior>
          </gml:Polygon>
        </ogc:Intersects>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>
```

* To find all datasets intersecting a polygon within a given time span (my_xml_request_intersecting_a_polygon_within_a_given_time_span.xml):

```xml
<?xml version="1.0" encoding="ISO-8859-1" standalone="no"?>
<csw:GetRecords
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:ogc="http://www.opengis.net/ogc"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    service="CSW"
    version="2.0.2"
    resultType="results"
    maxRecords="100"
    outputFormat="application/xml"
    outputSchema="http://www.opengis.net/cat/csw/2.0.2"
    xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd" >
  <csw:Query typeNames="csw:Record">
    <csw:ElementSetName>summary</csw:ElementSetName>
    <csw:Constraint version="1.1.0">
      <ogc:Filter>
        <ogc:And>
          <ogc:Intersects>
            <ogc:PropertyName>ows:BoundingBox</ogc:PropertyName>
            <gml:Polygon>
              <gml:exterior>
                <gml:LinearRing>
                  <gml:posList>
                    63.3984 7.65173 60.7546 5.0449 59.0639 10.187 62.9065 12.4944 63.3984 7.65173
                  </gml:posList>
                </gml:LinearRing>
              </gml:exterior>
            </gml:Polygon>
          </ogc:Intersects>
          <ogc:PropertyIsGreaterThanOrEqualTo>
            <ogc:PropertyName>apiso:TempExtent_begin</ogc:PropertyName>
            <ogc:Literal>2022-03-01 00:00</ogc:Literal>
          </ogc:PropertyIsGreaterThanOrEqualTo>
          <ogc:PropertyIsLessThanOrEqualTo>
            <ogc:PropertyName>apiso:TempExtent_end</ogc:PropertyName>
            <ogc:Literal>2023-03-08 00:00</ogc:Literal>
          </ogc:PropertyIsLessThanOrEqualTo>
        </ogc:And>
      </ogc:Filter>
    </csw:Constraint>
  </csw:Query>
</csw:GetRecords>
```
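
The `gml:posList` content in the polygon requests is a flat list of coordinates in which the first position is repeated at the end to close the ring. The sketch below formats such a list from vertex pairs. It assumes latitude-then-longitude order, which is what the examples above appear to use; the helper name `pos_list` is hypothetical.

```python
def pos_list(vertices):
    """Format (lat, lon) vertices as gml:posList text, closing the ring if needed.

    Assumes latitude-then-longitude axis order, as in the examples above.
    """
    ring = [tuple(vertex) for vertex in vertices]
    if len(ring) < 3:
        raise ValueError("A polygon needs at least three distinct vertices")
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # repeat the first position to close the ring
    return " ".join(f"{lat} {lon}" for lat, lon in ring)


corners = [(47.0, -5.0), (55.0, -5.0), (55.0, 20.0), (47.0, 20.0)]
print(pos_list(corners))
```

The resulting string can be pasted between the `<gml:posList>` tags of the request files above.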

* Then, you can query the CSW endpoint and print the response text using, e.g., python:

In [2]:
import re

import requests
import xarray as xr

# Define the headers
headers = {'Content-Type': 'application/xml'}

# Specify the XML file that should be used for the search
# my_xml_request = 'my_xml_request_containing_a_point.xml'
my_xml_request = 'my_xml_request_intersecting_a_polygon.xml'
# my_xml_request = 'my_xml_request_intersecting_a_polygon_within_a_given_time_span.xml'

# Open and read the XML file
with open(my_xml_request, 'r') as file:
    xml_data = file.read()

# Send the POST request
response = requests.post('https://data.csw.met.no', data=xml_data, headers=headers)

# The response text
print(response.text)

# Extract the OPENDAP urls

# The pattern matches "https://thredds.met.no/thredds/dodsC/{regardless_of_what_is_in_between}.nc",
# with the "ml" ending included only if found, so that ".ncml" urls are matched as well.
my_pattern = r'https://thredds\.met\.no/thredds/dodsC/.*?\.nc(?:ml)?'

# re.findall() returns all non-overlapping matches of my_pattern in the string, as a list of strings
opendap_urls = re.findall(my_pattern, response.text)

# List of OPENDAP urls
print(f'List contains {len(opendap_urls)} urls:')
print(opendap_urls)

# Open the first dataset in the list of urls, or report that nothing was found
if len(opendap_urls) > 0:
    ds = xr.open_dataset(opendap_urls[0])
else:
    ds = "No file(s) match the search criteria."

ds

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- pycsw 2.7.dev0 -->
<csw:GetRecordsResponse xmlns:csw="http://www.opengis.net/cat/csw/2.0.2" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dct="http://purl.org/dc/terms/" xmlns:gmd="http://www.isotc211.org/2005/gmd" xmlns:gml="http://www.opengis.net/gml" xmlns:ows="http://www.opengis.net/ows" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="2.0.2" xsi:schemaLocation="http://www.opengis.net/cat/csw/2.0.2 http://schemas.opengis.net/csw/2.0.2/CSW-discovery.xsd"><csw:SearchStatus timestamp="2024-10-10T06:46:32Z"/><csw:SearchResults numberOfRecordsMatched="212039" numberOfRecordsReturned="10" nextRecord="11" recordSchema="http://www.opengis.net/cat/csw/2.0.2" elementSet="full"><csw:Record><dc:identifier>no.met:31ac1963-074a-43a4-b1b3-7423e8307827</dc:identifier><dc:title>Meps 2.5 km surface parameters from ensemble member 6 2024-05-10T20:00:00Z + 66 hours</dc:title><dc:ty
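
Rather than reading the raw XML, the response can be parsed to pull out the record counts and titles. The following is a minimal sketch using only the standard library. The function name `summarize_response` is hypothetical, and `sample` is a shortened, hand-written stand-in for a server response, not actual output.

```python
import xml.etree.ElementTree as ET

# Namespaces as declared in the GetRecordsResponse shown above
NS = {
    "csw": "http://www.opengis.net/cat/csw/2.0.2",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def summarize_response(xml_data):
    """Return the record counts and titles of a CSW GetRecords response."""
    root = ET.fromstring(xml_data)
    results = root.find("csw:SearchResults", NS)
    if results is None:
        raise ValueError("No csw:SearchResults element in the response")
    return {
        "matched": int(results.get("numberOfRecordsMatched")),
        "returned": int(results.get("numberOfRecordsReturned")),
        "next_record": int(results.get("nextRecord")),
        "titles": [t.text for t in results.iterfind("csw:Record/dc:title", NS)],
    }


# Hand-written illustration, not real server output
sample = """<csw:GetRecordsResponse
    xmlns:csw="http://www.opengis.net/cat/csw/2.0.2"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <csw:SearchResults numberOfRecordsMatched="2" numberOfRecordsReturned="1" nextRecord="2">
    <csw:Record><dc:title>Example title</dc:title></csw:Record>
  </csw:SearchResults>
</csw:GetRecordsResponse>"""

print(summarize_response(sample))
```

For a real query, `summarize_response(response.content)` would take the bytes of the `requests` response from the cell above. In the output shown, `numberOfRecordsMatched` is far larger than `numberOfRecordsReturned`, so only the first page of results (limited by `maxRecords`) is included.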

### How to Visualize Data with WMS (Web Map Service)

#### By the Use of data.met.no

By using data.met.no it is possible to both find and visualise datasets. The web search interface can be accessed from the "Data Catalog" menu item, or directly at https://data.met.no/metsis/search. Note that the ENVISAT ASAR data could not be found there at the time of writing; presumably they are so far only available on the staging site. As seen below, the search interface consists of a map and a series of filters.

![Data Catalog Overview](../images/DataMetNo_Data_Catalog_Overview_image.png)

The map displays the available datasets in the metadata catalog page by page, each as a max/min longitude/latitude rectangle, sorted to showcase the latest additions first. One can also interact with the map to better display the results and to perform a data search.

* "Select Projection", located just above the map, changes the map projection. "Spatial filter" can be set to either "Within" or "Intersects".
* The "Create bounding box" button makes it possible to draw a bounding box directly on the map, which then works as a filter on the results.
* The "Reset Search"-button clears the filters and starts a new search.
* The "Reset Map"-button resets the map.

Map widgets allow direct interaction with the map:

* +/-:                     Zoom in/out.
* E:                       Zooms to the extent of the displayed datasets.
* Menu tag:                Opens side panel where WMS Layers, Features and Base Layers can be altered.
* Magnifying glass:        Enables searching for location names.
* '>>':                    Shows the location in an overview world map.
* Upper right hand widget: Toggles full screen mode.

Search filters can also be used to find the desired datasets. The results are updated dynamically when filters are selected. The filters are:

* A full text search block where the options "Contains all of these words" and "Contains any of these words" are available.
* Start and end date of the desired datasets.
* An option named "Has children" to determine whether datasets are parents with children (i.e. records of the same type).
* The desired sorting mechanism.
* ISO topic categories: The general subjects for which the geospatial data may be relevant, as defined by the ISO standard.
* Keywords: Keywords from a controlled vocabulary.
* Activity type: The nature of the dataset(s) generation process (Numerical Simulation, Climate Indicator, In Situ Land-based station, Space Borne Instrument).
* Project: Datasets related to a certain project.

By clicking the "Reset"-button all filters are removed and a new search can be initiated.




#### By the Use of QGIS

As MET Norway's S-ENDA CSW catalog service is also available through QGIS, series/datasets can be found and inspected as follows:

1. First open QGIS and select a map, e.g. the OpenStreetMap:

    ![QGIS startup](../images/QGIS_png/QGIS_startup.png)
    
    <br />
    <br />

2. From the menu select "Web > MetaSearch > MetaSearch".

    ![QGIS Web MetaSearch MetaSearch](../images/QGIS_png/QGIS_Web_MetaSearch_MetaSearch.png)
    
    <br />
    <br />

3. Select "Services > New" to open the "New Catalog Service" window.

    ![QGIS Services New](../images/QGIS_png/QGIS_Services_New.png)

    <br />
    <br />
4. For the "Name", type "csw.s-enda-staging.k8s.met.no". For the "URL", type "https://csw.s-enda-staging.k8s.met.no". Clicking "OK" then adds the required server.

    ![QGIS New Catalog Service](../images/QGIS_png/QGIS_New_Catalog_Service.png)

    <br />
    <br />
5. Without exiting "MetaSearch", move back to the "Search" tab. Now the server that was just added is selected in the "From"-menu.

    ![QGIS MetaSearch ready](../images/QGIS_png/QGIS_MetaSearch_ready.png)

    <br />
    <br />

6. To get a list of the available series/datasets, different search parameters can be added under the "Search" tab. Adding keywords will single out the series and datasets that have these as part of their "Title". To find the "_calibrated geophysical ENVISAT ASAR wide-swath range frequency shift retrievals_" series/datasets, the phrase in italics can be entered in the "Keywords" field, but "ENVISAT ASAR" or "Doppler" will also suffice. To run the search, click "Search". The series/datasets will then show up in the "Results" section.

    ![QGIS METASearch Keywords added](../images/QGIS_png/QGIS_METASearch_Keywords_added.png)
    
    <br />
    <br />

7. When a search is made, the results can alternatively be displayed as a scrollable list of XML records. This is done by clicking "View Search Results as XML" in the "MetaSearch" window, which opens a new window named "XML Request / Response". Here the series/datasets resulting from the search are displayed as XML.

    ![QGIS View Search Results as XML](../images/QGIS_png/QGIS_View_Search_Results_as_XML.png)
    
    <br />
    <br />

8. Back in the MetaSearch window, the geographical extent of a selected series/dataset can be displayed quickly. Clicking one of the series/datasets makes a red bounding box appear on the map, highlighting the geographical extent of that dataset.

    ![QGIS MetaSearch random dataset selected](../images/QGIS_png/QGIS_MetaSearch_random_dataset_selected.png)
    
    <br />
    <br />

    ![QGIS Bounding Box of randomly selected dataset](../images/QGIS_png/QGIS_Bounding_Box_of_randomly_selected_dataset.png)
    
    <br />
    <br />

9. To display the full record information together with the associated links, double-click the selected series/dataset. This opens a new window named "Record Metadata", which displays the record metadata and the associated links of the selected series/dataset.

    ![QGIS MetaSearch random dataset selected to view Record Metadata](../images/QGIS_png/QGIS_MetaSearch_random_dataset_selected_to_view_Record_Metadata.png)
    
    <br />
    <br />

    ![QGIS Record Metadata display](../images/QGIS_png/QGIS_Record_Metadata_display.png)
    
    <br />
    <br />

10. If the exact date and time of the desired dataset are known, these can also be added alongside keywords such as "ENVISAT ASAR" or "Doppler" in the MetaSearch. This will single out this specific dataset.

    ![QGIS MetaSearch specific date](../images/QGIS_png/QGIS_MetaSearch_specific_date.png)
    
    <br />
    <br />

    ![QGIS Bounding Box specific date selected](../images/QGIS_png/QGIS_Bounding_Box_specific_date_selected.png)
    
    <br />
    <br />

11. It is also possible to change the bounding box of the search. This is done by editing the latitude and longitude values in the "Ymax/min" and "Xmax/min" fields, respectively. To quickly reset these to the global default settings, click "Set Global". Clicking "Map Extent" will limit the bounding box to the extent of the map.

    ![QGIS MetaSearch specific geographical extent selected](../images/QGIS_png/QGIS_MetaSearch_specific_geographical_extent_selected.png)
    
    <br />
    <br />

    ![QGIS_Bounding Box specific geographical extent selected](../images/QGIS_png/QGIS_Bounding_Box_specific_geographical_extent_selected.png)
    
    <br />
    <br />

Following the points above should provide an easy and efficient way of displaying and finding desired series/datasets.
