# Data Loading for Grapevine (Powdery Mildew)
## Pablo Lavín
## Master's Thesis (TFM) - Master in Data Science (UC-UIMP)
## JAE Intro ICU 2025 Fellowship
## September 2025 - June 2026

This notebook loads and preprocesses the climate data for the period 1981-2022, using seasonal predictions (hindcast and forecast) from the European Centre (ECMWF) model and reference observations from the ERA5-Land reanalysis.

The analysis focuses on mean temperature (tas) and relative humidity (hr) in May and June (critical phases of grapevine flowering), preparing the data for later evaluation and comparison.

## Setup

In [1]:
# Load packages
source("../../../scripts/setup_libraries.R")
source("../../../scripts/load_bc_functions.R")


Attaching package: ‘dplyr’


The following objects are masked from ‘package:stats’:

    filter, lag


The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union


Loading required package: rJava

Loading required package: loadeR.java

Java version 23x amd64 by N/A detected

NetCDF Java Library v4.6.0-SNAPSHOT (23 Apr 2015) loaded and ready

Loading required package: climate4R.UDG

climate4R.UDG version 0.2.6 (2023-06-26) is loaded

Please use 'citation("climate4R.UDG")' to cite this package.

loadeR version 1.8.1 (2023-06-22) is loaded


Get the latest stable version (1.8.2) using <devtools::install_github(c('SantanderMetGroup/climate4R.UDG','SantanderMetGroup/loadeR'))>

Please use 'citation("loadeR")' to cite this package.




    _______   ____  ___________________  __  ________ 
   / ___/ /  / /  |/  / __  /_  __/ __/ / / / / __  / 
  / /  / /  / / /|_/ / /_/ / / / / __/ / /_/ / /_/_/  
 / /__/ /__/ / /  / / __  / / / / /__ /___  / / \ \ 
 \___/____/_/_/  /_/_/ /_/ /_/  \___/    /_/\/   \_\ 
 
      github.com/SantanderMetGroup/climate4R



transformeR version 2.2.2 (2023-10-26) is loaded


Get the latest stable version (2.2.3) using <devtools::install_github('SantanderMetGroup/transformeR')>

Please see 'citation("transformeR")' to cite this package.

Loading required package: udunits2

udunits system database read from /vols/abedul/home/meteo/lavinp/miniforge3/envs/C4R/share/udunits/udunits2.xml

convertR version 0.3.0 (2025-07-31) is loaded


Development version may have an unexpected behaviour

  More information about the 'climate4R' ecosystem in: http://meteo.unican.es/climate4R


Attaching package: ‘convertR’


The following objects are masked from ‘package:loadeR’:

    hurs2huss, huss2hurs, tdps2hurs


visualizeR version 1.6.4 (2023-10-26) is loaded

Please see 'citation("visualizeR")' to cite this package.

downscaleR version 3.3.4 (2023-06-22) is loaded

Please use 'citation("downscaleR")' to cite this package.

Loading required package: climdex.pcic

Loading required package: PCICt

climate4R.climdex version 0

In [2]:
# Study region

lon = c(-10, 5)
lat = c(35, 44)

## Load hindcast tas (lead times 0, 1, 2, 3)

In [3]:
anios = 1981:2016
meses_ini = c("05", "04", "03", "02")  # initialization months

# Generic function to load the data for a given year and initialization month
cargar_dato = function(anio, mes_ini) {

    yyyymm = paste0(anio, mes_ini)

    ruta = paste0(
        "/lustre/gmeteo/PTICLIMA/DATA/SEASONAL/",
        "seasonal-original-single-levels/medcof/hindcast/tas/ecmwf/51/", yyyymm, "/",
        "seasonal-original-single-levels_medcof_hindcast_tas_ecmwf_51_", yyyymm, ".ncml"
    )

    data_aux = loadGridData(dataset = ruta,
                            var = "tas",
                            lonLim = lon,
                            latLim = lat,
                            season = c(5, 6)) %>% suppressMessages %>% suppressWarnings

    return(data_aux)
}

# Build a list in which each element holds the output for one initialization month
hindcast = lapply(meses_ini, function(mes) {
    lapply(anios, function(anio) cargar_dato(anio, mes))
})

# Name the elements by initialization month
names(hindcast) = paste0("mes_", meses_ini)
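The loader above is redefined later in the notebook for the forecast and for hurs, with only the path changing. A minimal sketch of a shared path builder follows. `build_seasonal_path` and its arguments are hypothetical names (they are not part of the original notebook or of loadeR), and the directory layout is simply copied from the paths used in this notebook.

```r
# Hypothetical helper: builds the .ncml path for one initialization.
# The layout is copied from the paths used in this notebook.
build_seasonal_path = function(kind, variable, anio, mes_ini,
                               base = "/lustre/gmeteo/PTICLIMA/DATA/SEASONAL/") {
    stopifnot(kind %in% c("hindcast", "forecast"))
    yyyymm = paste0(anio, mes_ini)
    paste0(
        base,
        "seasonal-original-single-levels/medcof/", kind, "/", variable, "/ecmwf/51/", yyyymm, "/",
        "seasonal-original-single-levels_medcof_", kind, "_", variable, "_ecmwf_51_", yyyymm, ".ncml"
    )
}

# Example: hindcast tas initialized in May 1981
ruta_ejemplo = build_seasonal_path("hindcast", "tas", 1981, "05")
```

The resulting string could be passed as the `dataset` argument of `loadGridData`, so a single loader would cover both experiments and both variables.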

In [4]:
# Assign each list to a name that encodes its lead time
hindcast_0 = hindcast[["mes_05"]]
hindcast_1 = hindcast[["mes_04"]]
hindcast_2 = hindcast[["mes_03"]]
hindcast_3 = hindcast[["mes_02"]]

# Bind the grids along the time dimension
hindcast_0_grid = bindGrid(hindcast_0, dimension = "time")
hindcast_1_grid = bindGrid(hindcast_1, dimension = "time")
hindcast_2_grid = bindGrid(hindcast_2, dimension = "time")
hindcast_3_grid = bindGrid(hindcast_3, dimension = "time")
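The numeric suffixes used above follow from the initialization month: the target season starts in May, so a May initialization is lead time 0 and a February initialization is lead time 3. A small sketch of that mapping, where `primer_mes_objetivo` and `leadtimes` are illustrative names not used elsewhere in the notebook:

```r
# Initialization months as defined in the notebook
meses_ini = c("05", "04", "03", "02")

# First month of the target season (May)
primer_mes_objetivo = 5

# Lead time in months = first target month minus initialization month
leadtimes = primer_mes_objetivo - as.integer(meses_ini)
names(leadtimes) = paste0("mes_", meses_ini)
leadtimes
```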

## Load forecast tas (lead times 0, 1, 2, 3)

In [5]:
anios = 2017:2022
meses_ini = c("05", "04", "03", "02")  # initialization months

# Generic function to load the data for a given year and initialization month
cargar_dato = function(anio, mes_ini) {

    yyyymm = paste0(anio, mes_ini)

    ruta = paste0(
        "/lustre/gmeteo/PTICLIMA/DATA/SEASONAL/",
        "seasonal-original-single-levels/medcof/forecast/tas/ecmwf/51/", yyyymm, "/",
        "seasonal-original-single-levels_medcof_forecast_tas_ecmwf_51_", yyyymm, ".ncml"
    )

    data_aux = loadGridData(dataset = ruta,
                            var = "tas",
                            lonLim = lon,
                            latLim = lat,
                            season = c(5, 6)) %>% suppressMessages %>% suppressWarnings

    return(data_aux)
}

# Build a list in which each element holds the output for one initialization month
forecast = lapply(meses_ini, function(mes) {
    lapply(anios, function(anio) cargar_dato(anio, mes))
})

# Name the elements by initialization month
names(forecast) = paste0("mes_", meses_ini)

In [6]:
# Assign each list to a name that encodes its lead time
forecast_0 = forecast[["mes_05"]]
forecast_1 = forecast[["mes_04"]]
forecast_2 = forecast[["mes_03"]]
forecast_3 = forecast[["mes_02"]]

# Bind the grids along the time dimension
forecast_0_grid = bindGrid(forecast_0, dimension = "time")
forecast_1_grid = bindGrid(forecast_1, dimension = "time")
forecast_2_grid = bindGrid(forecast_2, dimension = "time")
forecast_3_grid = bindGrid(forecast_3, dimension = "time")

# Keep only the first 25 members
forecast_0_members = subsetGrid(forecast_0_grid, members = 1:25)
forecast_1_members = subsetGrid(forecast_1_grid, members = 1:25)
forecast_2_members = subsetGrid(forecast_2_grid, members = 1:25)
forecast_3_members = subsetGrid(forecast_3_grid, members = 1:25)

# Bind the hindcast and forecast grids along the time dimension
ecmwf_0_grid = bindGrid(hindcast_0_grid, forecast_0_members, dimension = "time")
ecmwf_1_grid = bindGrid(hindcast_1_grid, forecast_1_members, dimension = "time")
ecmwf_2_grid = bindGrid(hindcast_2_grid, forecast_2_members, dimension = "time")
ecmwf_3_grid = bindGrid(hindcast_3_grid, forecast_3_members, dimension = "time")
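The forecast is cut to 25 members before being joined to the hindcast, presumably because concatenating along time requires both blocks to share the same member dimension. The toy sketch below illustrates the idea with plain base R matrices instead of climate4R grids. The 51-member size for the forecast and all object names are assumptions made for illustration only.

```r
set.seed(1)

# Toy stand-ins: rows are members, columns are time steps
hind_toy = matrix(rnorm(25 * 4), nrow = 25)   # 25 members, 4 time steps
fore_toy = matrix(rnorm(51 * 2), nrow = 51)   # 51 members (assumed), 2 time steps

# Keep the first 25 members so both blocks share the member dimension
fore_25 = fore_toy[1:25, , drop = FALSE]

# Concatenate along time (columns)
ecmwf_toy = cbind(hind_toy, fore_25)
dim(ecmwf_toy)
```

Without the subsetting step, `cbind(hind_toy, fore_toy)` would fail because the row counts differ.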

## Load tas from ERA5-Land

In [7]:
# Define years and months
anios = 1981:2022
meses = sprintf("%02d", c(5, 6))

# Function that builds the path and loads the data
cargar_dato = function(anio, mes) {
    yyyy = paste0(anio)
    yyyymm = paste0(anio, mes)
    ruta = paste0(
        "/lustre/gmeteo/PTICLIMA/DATA/REANALYSIS/ERA5-Land/data/Iberia/day/t2m/", yyyy, "/",
        "t2m_ERA5-Land_", yyyymm, ".nc"
    )
    
    # Load the dataset
    data_aux = loadGridData(dataset = ruta,
                            var = "t2m",
                            lonLim = lon,
                            latLim = lat) %>% suppressMessages %>% suppressWarnings

    return(data_aux)
}

# Build all year-month combinations (May and June of each year)
combinaciones = do.call(rbind, lapply(anios, function(anio) {
    data.frame(
        anio = rep(anio, length(meses)),
        mes  = meses,
        stringsAsFactors = FALSE)
}))

# Apply the loader to each combination
era5_data = lapply(seq_len(nrow(combinaciones)), function(i) {
    cargar_dato(combinaciones$anio[i], combinaciones$mes[i])
})

# Bind the grids along the time dimension
era5_time = bindGrid(era5_data, dimension = "time")

# Upscale the observations to the model grid
era5_ups = interpGrid(era5_time,
                      new.coordinates = getGrid(ecmwf_0_grid),
                      method = "bilinear") %>% suppressMessages %>% suppressWarnings

# Convert the observations from Kelvin to Celsius
tas_obs_cel = gridArithmetics(era5_ups, 273.15, operator = "-")
attr(tas_obs_cel$Variable, "units") = "degC"
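Since the ERA5-Land files are read one month at a time and then bound along time, a quick count of the expected number of time steps can help catch a missing or truncated file. The sketch below assumes the files hold one value per day (the path contains `day`) and that every file is complete. `dias_por_anio` and `n_esperado` are illustrative names.

```r
anios = 1981:2022

# Days in May plus June for one year; neither month depends on leap years
dias_por_anio = length(seq(as.Date("1981-05-01"), as.Date("1981-06-30"), by = "day"))

# Expected length of the time dimension after binding all years
n_esperado = length(anios) * dias_por_anio
n_esperado
```

The result could then be compared with the time dimension of the bound grid, for example with `getShape(era5_time)["time"]` from transformeR.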

In [8]:
# Save the data
saveRDS(ecmwf_0_grid, file = "tas_oidio_model_vid_0.rds")
saveRDS(ecmwf_1_grid, file = "tas_oidio_model_vid_1.rds")
saveRDS(ecmwf_2_grid, file = "tas_oidio_model_vid_2.rds")
saveRDS(ecmwf_3_grid, file = "tas_oidio_model_vid_3.rds")

saveRDS(tas_obs_cel, file = "tas_oidio_obs_vid.rds")

## Load hindcast hr (lead times 0, 1, 2, 3)

In [9]:
anios = 1981:2016
meses_ini = c("05", "04", "03", "02")  # initialization months

# Generic function to load the data for a given year and initialization month
cargar_dato = function(anio, mes_ini) {

    yyyymm = paste0(anio, mes_ini)

    ruta = paste0(
        "/lustre/gmeteo/PTICLIMA/DATA/SEASONAL/",
        "seasonal-original-single-levels/medcof/hindcast/hurs/ecmwf/51/", yyyymm, "/",
        "seasonal-original-single-levels_medcof_hindcast_hurs_ecmwf_51_", yyyymm, ".ncml"
    )

    data_aux = loadGridData(dataset = ruta,
                            var = "hurs",
                            lonLim = lon,
                            latLim = lat,
                            season = c(5, 6)) %>% suppressMessages %>% suppressWarnings

    return(data_aux)
}

# Build a list in which each element holds the output for one initialization month
hindcast = lapply(meses_ini, function(mes) {
    lapply(anios, function(anio) cargar_dato(anio, mes))
})

# Name the elements by initialization month
names(hindcast) = paste0("mes_", meses_ini)

In [10]:
# Assign each list to a name that encodes its lead time
hindcast_0 = hindcast[["mes_05"]]
hindcast_1 = hindcast[["mes_04"]]
hindcast_2 = hindcast[["mes_03"]]
hindcast_3 = hindcast[["mes_02"]]

# Bind the grids along the time dimension
hindcast_0_grid = bindGrid(hindcast_0, dimension = "time")
hindcast_1_grid = bindGrid(hindcast_1, dimension = "time")
hindcast_2_grid = bindGrid(hindcast_2, dimension = "time")
hindcast_3_grid = bindGrid(hindcast_3, dimension = "time")

## Load forecast hr (lead times 0, 1, 2, 3)

In [11]:
anios = 2017:2022
meses_ini = c("05", "04", "03", "02")  # initialization months

# Generic function to load the data for a given year and initialization month
cargar_dato = function(anio, mes_ini) {

    yyyymm = paste0(anio, mes_ini)

    ruta = paste0(
        "/lustre/gmeteo/PTICLIMA/DATA/SEASONAL/",
        "seasonal-original-single-levels/medcof/forecast/hurs/ecmwf/51/", yyyymm, "/",
        "seasonal-original-single-levels_medcof_forecast_hurs_ecmwf_51_", yyyymm, ".ncml"
    )

    data_aux = loadGridData(dataset = ruta,
                            var = "hurs",
                            lonLim = lon,
                            latLim = lat,
                            season = c(5, 6)) %>% suppressMessages %>% suppressWarnings

    return(data_aux)
}

# Build a list in which each element holds the output for one initialization month
forecast = lapply(meses_ini, function(mes) {
    lapply(anios, function(anio) cargar_dato(anio, mes))
})

# Name the elements by initialization month
names(forecast) = paste0("mes_", meses_ini)

In [12]:
# Assign each list to a name that encodes its lead time
forecast_0 = forecast[["mes_05"]]
forecast_1 = forecast[["mes_04"]]
forecast_2 = forecast[["mes_03"]]
forecast_3 = forecast[["mes_02"]]

# Bind the grids along the time dimension
forecast_0_grid = bindGrid(forecast_0, dimension = "time")
forecast_1_grid = bindGrid(forecast_1, dimension = "time")
forecast_2_grid = bindGrid(forecast_2, dimension = "time")
forecast_3_grid = bindGrid(forecast_3, dimension = "time")

# Keep only the first 25 members
forecast_0_members = subsetGrid(forecast_0_grid, members = 1:25)
forecast_1_members = subsetGrid(forecast_1_grid, members = 1:25)
forecast_2_members = subsetGrid(forecast_2_grid, members = 1:25)
forecast_3_members = subsetGrid(forecast_3_grid, members = 1:25)

# Bind the hindcast and forecast grids along the time dimension
ecmwf_0_grid = bindGrid(hindcast_0_grid, forecast_0_members, dimension = "time")
ecmwf_1_grid = bindGrid(hindcast_1_grid, forecast_1_members, dimension = "time")
ecmwf_2_grid = bindGrid(hindcast_2_grid, forecast_2_members, dimension = "time")
ecmwf_3_grid = bindGrid(hindcast_3_grid, forecast_3_members, dimension = "time")

## Load dew point (d2m) from ERA5-Land

In [13]:
# Define years and months
anios = 1981:2022
meses = sprintf("%02d", c(5, 6))

# Function that builds the path and loads the data
cargar_dato = function(anio, mes) {
    yyyy = paste0(anio)
    yyyymm = paste0(anio, mes)
    ruta = paste0(
        "/lustre/gmeteo/PTICLIMA/DATA/REANALYSIS/ERA5-Land/data/Iberia/day/d2m/", yyyy, "/",
        "d2m_ERA5-Land_", yyyymm, ".nc"
    )
    
    # Load the dataset
    data_aux = loadGridData(dataset = ruta,
                            var = "d2m",
                            lonLim = lon,
                            latLim = lat) %>% suppressMessages %>% suppressWarnings

    return(data_aux)
}

# Build all year-month combinations (May and June of each year)
combinaciones = do.call(rbind, lapply(anios, function(anio) {
    data.frame(
        anio = rep(anio, length(meses)),
        mes  = meses,
        stringsAsFactors = FALSE)
}))

# Apply the loader to each combination
d2m_era5_data = lapply(seq_len(nrow(combinaciones)), function(i) {
    cargar_dato(combinaciones$anio[i], combinaciones$mes[i])
})

# Bind the grids along the time dimension
d2m_era5_time = bindGrid(d2m_era5_data, dimension = "time")

# Upscale the observations to the model grid
d2m_era5_ups = interpGrid(d2m_era5_time,
                          new.coordinates = getGrid(ecmwf_0_grid),
                          method = "bilinear") %>% suppressMessages %>% suppressWarnings

# Convert the observations from Kelvin to Celsius
d2m_obs_cel = gridArithmetics(d2m_era5_ups, 273.15, operator = "-")
attr(d2m_obs_cel$Variable, "units") = "degC"

# Compute ERA5-Land relative humidity from dew point and temperature
hr_obs = tdps2hurs(tdps = d2m_obs_cel, tas = tas_obs_cel) %>% suppressMessages %>% suppressWarnings
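To make the dew point to relative humidity step concrete, here is a minimal sketch based on the Magnus approximation for saturation vapour pressure. It is an illustration only: the formulation and coefficients used internally by `tdps2hurs` in convertR may differ, and the function names below are hypothetical.

```r
# Saturation vapour pressure (hPa) over water, Magnus form, temperature in degC
presion_saturacion = function(t_c) {
    6.112 * exp(17.62 * t_c / (243.12 + t_c))
}

# Relative humidity (%) from air temperature and dew point, both in degC
hr_desde_rocio = function(tas_c, tdps_c) {
    100 * presion_saturacion(tdps_c) / presion_saturacion(tas_c)
}

# Saturated air (dew point equal to temperature) and a drier case
hr_ejemplo = hr_desde_rocio(tas_c = 20, tdps_c = c(20, 10))
hr_ejemplo
```

When the dew point equals the air temperature the ratio is exactly 1, so the first value is 100 %. Lowering the dew point at fixed temperature reduces the relative humidity.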

In [14]:
# Save the data
saveRDS(ecmwf_0_grid, file = "hr_oidio_model_vid_0.rds")
saveRDS(ecmwf_1_grid, file = "hr_oidio_model_vid_1.rds")
saveRDS(ecmwf_2_grid, file = "hr_oidio_model_vid_2.rds")
saveRDS(ecmwf_3_grid, file = "hr_oidio_model_vid_3.rds")

saveRDS(hr_obs, file = "hr_oidio_obs_vid.rds")
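The four model files differ only in the lead-time suffix, so the save step can also be written as a loop, and a later notebook can read the objects back with `readRDS`. The sketch below uses small integer vectors as stand-ins for the climate4R grids and writes to a temporary directory. Both choices are assumptions made so the example runs on its own.

```r
# Toy stand-ins for the four lead-time grids (the real objects are climate4R grids)
grids_toy = list(`0` = 1:3, `1` = 4:6, `2` = 7:9, `3` = 10:12)

# Assumption: a temporary directory, whereas the notebook writes to the working directory
dir_salida = tempdir()

for (lead in names(grids_toy)) {
    archivo = file.path(dir_salida, paste0("hr_oidio_model_vid_", lead, ".rds"))
    saveRDS(grids_toy[[lead]], file = archivo)
}

# Read one file back, as a later notebook would
recuperado = readRDS(file.path(dir_salida, "hr_oidio_model_vid_2.rds"))
```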