<a href="https://colab.research.google.com/github/pablog765/Prueba_Conda/blob/main/03_Programatic_Access_Colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Introduction**

In this tutorial, you will learn how to access retrospective and forecast model data from GEOGLOWS using Python. We cover two approaches: sending a request to the REST API and saving the CSV response to a local file, and using the `geoglows` package to download and plot the data. Both approaches return streamflow data for a specific river segment, ready for analysis.

# **Retrospective Simulation**




## **Version 1: Using `requests` and `pandas`**

Below is a Python script that retrieves retrospective model data for a specified COMID (the unique identifier of a river segment in the GEOGLOWS model) and saves the data to a CSV file.

*   `comid` represents the unique identifier for the location you are interested in. You should replace `760684821` with your own COMID as needed.
*   `url` is the API endpoint for the retrospective data request. The COMID is included in the URL to specify which location’s data you want to retrieve.
*   `params` defines the query parameters for the API request. Here, `"format": "csv"` specifies that we want the data in CSV format, and `"start_date": "19400101"` sets the starting date for the data retrieval.
*   `headers` specifies that we accept the response in CSV format.
*   We use `requests.get` to send a GET request to the API endpoint with the specified URL, headers, and parameters.
*   We check if the request was successful by verifying if the HTTP status code is `200`.
*   If successful, the response content is read into a `pandas` DataFrame.
*   The DataFrame is then saved to a CSV file named using the COMID.
*   If the request fails, we print an error message along with the status code and response text for debugging purposes.

```python
import io
import requests
import pandas as pd

comid = 760684821

# API endpoint and parameters
url = "https://geoglows.ecmwf.int/api/v2/retrospective/{}".format(comid)
params = {
    "format": "csv",
    "start_date": "19400101"
}

# Headers
headers = {
    "accept": "text/csv"
}

# Make the GET request
response = requests.get(url, headers=headers, params=params)

# Check if the request was successful
if response.status_code == 200:
    # Convert the response content to a DataFrame
    data = pd.read_csv(io.StringIO(response.content.decode('utf-8')))

    # Save the DataFrame to a CSV file
    data.to_csv("{}_retrospective_data.csv".format(comid), index=False)
    
    print("Data has been saved to {}_retrospective_data.csv".format(comid))

else:
    print(f"Failed to retrieve data. HTTP Status code: {response.status_code}")
    print(response.text)
```
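The request setup described above can be wrapped in a small helper so that the river ID and start date are checked before any network call is made. This is only a sketch: `build_retrospective_request` is a hypothetical name introduced here for illustration, not part of any GEOGLOWS library, and it reuses the endpoint and parameters from the script above.

```python
from datetime import datetime

# Endpoint taken from the script above
BASE_URL = "https://geoglows.ecmwf.int/api/v2/retrospective"


def build_retrospective_request(river_id, start_date="19400101"):
    """Return (url, params) for a retrospective CSV request.

    Hypothetical helper for illustration only.
    """
    # Reject anything that is not an 8-character YYYYMMDD date, e.g. "1940-01-01"
    if len(start_date) != 8:
        raise ValueError("start_date must use the YYYYMMDD format")
    datetime.strptime(start_date, "%Y%m%d")

    url = "{}/{}".format(BASE_URL, int(river_id))
    params = {"format": "csv", "start_date": start_date}
    return url, params


url, params = build_retrospective_request(760684821)
print(url)
print(params)
```

The returned `url` and `params` can be passed to `requests.get` exactly as in the script above.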


## **Version 2: Using the `geoglows` package**

### The `geoglows.data.retrospective` Function

To obtain the retrospective simulation for any river segment, you need its river ID. The examples store it in the variable `comid`; the `geoglows` package calls the same value `river_id`.

The function `geoglows.data.retrospective` retrieves the retrospective simulation of streamflow for a given `river_id` from the AWS Open Data Program GEOGLOWS V2 S3 bucket. Here’s how to use it:

- **Arguments**:
  - `river_id` (int): the ID of a stream, should be a 9-digit integer.
  - `format` (str): the format to return the data, either 'df' for DataFrame or 'xarray'. This argument is optional and defaults to 'df'.
  
- **Returns**:
  - `pd.DataFrame`: This contains the streamflow data.

### The `geoglows.plots.retrospective` Function

This function plots the historical simulation, optionally together with return periods.

- **Parameters**:
  - `df` (required): A DataFrame containing the historical simulation data.
  - `plot_type` (str, optional): The type of plot to create. Defaults to 'plotly'.
  - `rp_df` (optional): A DataFrame containing the return periods.
  - `plot_titles` (optional): A text string or list of strings for the titles of the graphs.

Example of usage:

```python
!pip install geoglows

import geoglows
import pandas as pd

comid = 760684821

# Download the retrospective simulation for this river
simulated_df = geoglows.data.retrospective(comid)

# Parse the index as datetimes and replace negative flows with zero
simulated_df.index = pd.to_datetime(simulated_df.index)
simulated_df[simulated_df < 0] = 0

# Round-trip the index through date strings, which keeps only the date part
simulated_df.index = simulated_df.index.to_series().dt.strftime("%Y-%m-%d")
simulated_df.index = pd.to_datetime(simulated_df.index)

# Save the data and plot it
simulated_df.to_csv("{}_retrospective_data.csv".format(comid))

hydroviewer_figure = geoglows.plots.retrospective(simulated_df)
hydroviewer_figure.show()
```
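The example above replaces negative flows with a boolean-mask assignment. The sketch below shows what that step does on a small synthetic DataFrame and compares it with `DataFrame.clip`, which gives the same result without modifying the original. The column name and values are made up for illustration and do not come from GEOGLOWS.

```python
import pandas as pd

# Synthetic stand-in for a retrospective DataFrame (made-up column name and values)
flows = pd.DataFrame(
    {"flow": [1.5, -0.2, 3.0, 0.0]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03", "2020-01-04"]),
)

# Boolean-mask assignment, as in the example above (modifies the frame in place)
masked = flows.copy()
masked[masked < 0] = 0

# Same result with clip, leaving the original frame untouched
clipped = flows.clip(lower=0)

print(clipped["flow"].tolist())  # [1.5, 0.0, 3.0, 0.0]
```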


# **Forecast**


## **Using the `geoglows` package**

### The `geoglows.data.forecast` Function
To obtain the forecast simulation for any river segment, you need its river ID (the variable `comid` in the examples below).

The function `geoglows.data.forecast` retrieves the average forecasted flow for a certain `river_id` on a certain date. Here’s how to use it:

- **Arguments**:
  - `river_id` (int): the ID of a stream, should be a 9-digit integer.
  - `date` (str, optional): Specifies the date to request in YYYYMMDD format, returns the latest available if not specified.
  - `format` (str): the format to return the data, either 'df' for DataFrame or 'xarray'. This argument is optional and defaults to 'df'.
  - `data_source` (str, optional): Specifies the source of the data. Options are 'rest' (GEOGLOWS Rest API) or 'aws' (AWS Open Data Program GEOGLOWS V2 S3 bucket). The default is 'aws'.

- **Returns**:
  - `pd.DataFrame` or `dict` or `str`: This contains the streamflow data.
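The `date` argument expects a YYYYMMDD string. A minimal sketch of building such a string from a Python `date` object follows; `to_yyyymmdd` is a hypothetical helper name used only for illustration.

```python
from datetime import date, timedelta


def to_yyyymmdd(day):
    """Format a date as a YYYYMMDD string."""
    return day.strftime("%Y%m%d")


# Example: the day before a given reference date
reference = date(2024, 3, 1)
previous_day = to_yyyymmdd(reference - timedelta(days=1))
print(previous_day)  # 20240229
```

The resulting string could then be passed as `date=previous_day`. Whether a forecast is available for a given date depends on the data source.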

### The `geoglows.plots.forecast` Function

The `geoglows.plots.forecast` function plots forecasted streamflow data. It can also include return period data to place the forecast in the context of historical statistical ranges.

- **Arguments**:
  - **df** (`pandas.core.frame.DataFrame`): The DataFrame containing the forecasted streamflow data. This is typically retrieved from `geoglows.data.forecast`.
  - **plot_type** (`str`, optional): Specifies the type of plot to generate, with 'plotly' as the default option.
  - **rp_df** (`pandas.core.frame.DataFrame`, optional): An optional DataFrame that includes return period data, which helps in comparing forecasted flows against historical extremes.
  - **plot_titles** (`list`, optional): A list of strings to enhance the plot's title, providing more context or information. Each string in the list appears on a new line in the plot's title.

- **Returns**:
  - **go.Figure**: A Plotly graph object figure, which can be displayed directly in a Jupyter notebook or saved as an HTML file for sharing or further analysis.


```python
!pip install geoglows

import geoglows
import pandas as pd

comid = 760684821

# Download the latest available forecast for this river
forecast_data = geoglows.data.forecast(comid)

# Parse the index as datetimes and replace negative flows with zero
forecast_data.index = pd.to_datetime(forecast_data.index)
forecast_data[forecast_data < 0] = 0

# Round-trip the index through strings, which drops any timezone information
forecast_data.index = forecast_data.index.to_series().dt.strftime("%Y-%m-%d %H:%M:%S")
forecast_data.index = pd.to_datetime(forecast_data.index)

# Save the data and plot it
forecast_data.to_csv("{}_forecast_data.csv".format(comid))

hydroviewer_figure = geoglows.plots.forecast(df=forecast_data, plot_type='plotly')
hydroviewer_figure.show()
```


# **Forecast Ensembles**

## **Using the `geoglows` package**

### The `geoglows.data.forecast_ensembles` Function
To obtain the forecast ensembles for any river segment, you need its river ID (the variable `comid` in the examples below).

The function `geoglows.data.forecast_ensembles` retrieves the forecasted streamflow of each ensemble member for a given `river_id`. Here’s how to use it:

- **Arguments**:
  - `river_id` (int): the ID of a stream, should be a 9-digit integer.
  - `format` (str): the format to return the data, either 'df' for DataFrame or 'xarray'. This argument is optional and defaults to 'df'.
  - `date` (str, optional): Specifies the date for which to retrieve forecast data in YYYYMMDD format. If not specified, the latest available data will be returned.
  - `data_source` (str, optional): Specifies the source of the data. Options are 'rest' (GEOGLOWS Rest API) or 'aws' (AWS Open Data Program GEOGLOWS V2 S3 bucket). The default is 'aws'.

- **Returns**:
  - `pd.DataFrame` or `dict` or `str`: This contains the streamflow data.

### The `geoglows.plots.forecast_ensembles` Function

The `geoglows.plots.forecast_ensembles` function creates interactive visualizations using Plotly, which can help in analyzing the range and impact of forecasted streamflows under different scenarios.

- **Arguments**:
  - **df** (`pandas.core.frame.DataFrame`): The DataFrame containing the forecast ensemble data obtained from `geoglows.data.forecast_ensembles`.
  - **plot_type** (`str`, optional): Specifies the type of plot to generate, with 'plotly' as the default option.
  - **rp_df** (`pandas.core.frame.DataFrame`, optional): An optional DataFrame that includes return period data, which helps compare forecasted flows with historical extremes.
  - **plot_titles** (`list`, optional): A list of strings to enhance the plot title, providing more context or information. Each string in the list appears on a new line in the plot title.

- **Returns**:
  - **go.Figure**: A Plotly graphical object that can be displayed directly in a Jupyter notebook or saved as an HTML file for sharing or further analysis.

```python
!pip install geoglows

import geoglows
import pandas as pd

comid = 760684821

# Download the latest available forecast ensembles for this river
forecast_data = geoglows.data.forecast_ensembles(comid)

# Parse the index as datetimes and replace negative flows with zero
forecast_data.index = pd.to_datetime(forecast_data.index)
forecast_data[forecast_data < 0] = 0

# Round-trip the index through strings, which drops any timezone information
forecast_data.index = forecast_data.index.to_series().dt.strftime("%Y-%m-%d %H:%M:%S")
forecast_data.index = pd.to_datetime(forecast_data.index)

# Save under its own file name so the forecast CSV is not overwritten
forecast_data.to_csv("{}_forecast_ensembles_data.csv".format(comid))

hydroviewer_figure = geoglows.plots.forecast_ensembles(df=forecast_data, plot_type='plotly')
hydroviewer_figure.show()
```


# **Forecast Stats**

## **Using the `geoglows` package**

### The `geoglows.data.forecast_stats` Function
To summarize the forecast ensembles, you can retrieve the forecast stats, which include the min, 25%, mean, median, 75%, and max river discharge of the 51 ensemble members for a `river_id`. The 52nd, higher-resolution member is also included in the forecast stats. As before, you need the river ID (the variable `comid` in the examples below).

The function `geoglows.data.forecast_stats` retrieves these forecast statistics for a given `river_id`. Here’s how to use it:

- **Arguments**:
  - `river_id` (int): the ID of a stream, should be a 9-digit integer.
  - `format` (str): the format to return the data, either 'df' for DataFrame or 'xarray'. This argument is optional and defaults to 'df'.
  - `date` (str, optional): Specifies the date for which to retrieve forecast data in YYYYMMDD format. If not specified, the latest available data will be returned.
  - `data_source` (str, optional): Specifies the source of the data. Options are 'rest' (GEOGLOWS Rest API) or 'aws' (AWS Open Data Program GEOGLOWS V2 S3 bucket). The default is 'aws'.

- **Returns**:
  - `pd.DataFrame` or `dict` or `str`: This contains the streamflow data.
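To make the statistics listed above concrete, the sketch below computes the same kind of summary (min, 25%, mean, median, 75%, max) across the columns of a small synthetic ensemble DataFrame. It illustrates the idea only and is not how the `geoglows` package computes its forecast stats; the member names, values, and output column names are made up.

```python
import pandas as pd

# Synthetic ensemble forecast: rows are time steps, columns are members
# (member names and values are made up for illustration)
ensembles = pd.DataFrame(
    {
        "member_1": [1.0, 10.0],
        "member_2": [2.0, 20.0],
        "member_3": [3.0, 30.0],
        "member_4": [4.0, 40.0],
        "member_5": [5.0, 50.0],
    },
    index=pd.to_datetime(["2024-01-01 00:00", "2024-01-01 03:00"]),
)

# Summarize across members (axis=1) at each time step
stats = pd.DataFrame(
    {
        "min": ensembles.min(axis=1),
        "p25": ensembles.quantile(0.25, axis=1),
        "mean": ensembles.mean(axis=1),
        "median": ensembles.median(axis=1),
        "p75": ensembles.quantile(0.75, axis=1),
        "max": ensembles.max(axis=1),
    }
)
print(stats)
```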

### The `geoglows.plots.forecast_stats` Function

The `geoglows.plots.forecast_stats` function creates interactive visualizations using Plotly, which can help in analyzing the range and impact of forecasted streamflows under different scenarios.

- **Arguments**:
  - **df** (`pandas.core.frame.DataFrame`): The DataFrame containing the forecast stats data retrieved from `geoglows.data.forecast_stats`.
  - **plot_type** (`str`, optional): Specifies the type of plot to create. The default is 'plotly'.
  - **rp_df** (`pandas.core.frame.DataFrame`, optional): A DataFrame containing the return period data. Including this data provides context for the forecast by showing thresholds of statistical significance.
  - **plot_titles** (`list`, optional): A list of strings that will be included in the figure's title. Each item in the list will appear on a new line, allowing for detailed descriptions.

- **Returns**:
  - **go.Figure**: A Plotly graph object figure that can be displayed in a Jupyter notebook or saved to HTML.

```python
!pip install geoglows

import geoglows
import pandas as pd

comid = 760684821

# Download the latest available forecast stats for this river
forecast_data = geoglows.data.forecast_stats(comid)

# Parse the index as datetimes and replace negative flows with zero
forecast_data.index = pd.to_datetime(forecast_data.index)
forecast_data[forecast_data < 0] = 0

# Round-trip the index through strings, which drops any timezone information
forecast_data.index = forecast_data.index.to_series().dt.strftime("%Y-%m-%d %H:%M:%S")
forecast_data.index = pd.to_datetime(forecast_data.index)

# Save under its own file name so the forecast CSV is not overwritten
forecast_data.to_csv("{}_forecast_stats_data.csv".format(comid))

hydroviewer_figure = geoglows.plots.forecast_stats(df=forecast_data, plot_type='plotly')
hydroviewer_figure.show()
```
