We extract the data from three different sources:

* Copernicus: source of the climate data
* Júcar River Basin information systems (Sistemas de información de la Cuenca Hidrográfica del Júcar): source of the river, canal, and Albufera data
* CEDEX (Centro de Estudios y Experimentación de Obras Públicas): source of the reservoir data for the Júcar basin

CEDEX also publishes daily data for rivers and canals. To decide which source to use, we briefly compare the CEDEX river data with the data from the basin's information systems.

In [1]:
import pandas as pd

# Tables

## Date table

In [101]:
start_date = "1900-01-01"
end_date = "2024-11-15"
dates = pd.date_range(start=start_date, end=end_date, freq='D')

# Create a DataFrame with the dates
df_date = pd.DataFrame({'date': dates})

# Generate the integer identifier as yyyymmdd
df_date['date_id'] = df_date['date'].dt.strftime('%Y%m%d').astype(int)

df_date.to_csv('df_date.csv', index = False)
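
Because `date_id` is built with the `%Y%m%d` format, it can be turned back into a timestamp without joining against `df_date`. A minimal sketch, assuming a short illustrative date range; the names `sample` and `recovered` are hypothetical and not part of the notebook:

```python
import pandas as pd

# Small illustrative range; the notebook's real table spans 1900-01-01 to 2024-11-15.
sample = pd.DataFrame({'date': pd.date_range('2024-02-27', '2024-03-02', freq='D')})
sample['date_id'] = sample['date'].dt.strftime('%Y%m%d').astype(int)

# Parse the integer key back into a timestamp with the same format string.
recovered = pd.to_datetime(sample['date_id'].astype(str), format='%Y%m%d')

# With the year first, the integer key sorts in chronological order.
is_sorted = sample['date_id'].is_monotonic_increasing
```

Putting the year first is what keeps the key sortable; a `ddmmyyyy` key would not have this property.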

## Júcar river basin data

https://www.chj.es/es-es/medioambiente/sistemasdeinformacion/Paginas/Sistemasdeinformacion.aspx

### Rivers and canals

In [125]:
df_rios_canales = pd.read_csv('Extracción/Cuenca hidrográfica Juca SI/F2796_Rios_y_Canales_ROEA/F2796_D2_Serie día.csv', encoding='latin1', sep=';')
df_rios_canales = df_rios_canales.rename(columns = {'Cód. CHJ' : 'id_station','Fecha' : 'date','Cantidad (hm³)' : 'quantity_hm3'})
df_rios_canales = df_rios_canales[['id_station', 'date','quantity_hm3']]
df_rios_canales['date'] = pd.to_datetime(df_rios_canales['date'], format='%d-%m-%Y %H:%M:%S')
df_rios_canales = df_rios_canales.dropna()

df_rios_canales_id = pd.read_csv('Extracción/Cuenca hidrográfica Juca SI/F2796_Rios_y_Canales_ROEA/F2796_M0_Todas.csv', encoding='UTF-8',index_col = 0)
df_rios_canales_id = df_rios_canales_id.rename(columns = {'Cód. Estación' : 'id_station','Altitud (m)':'Altitud'})
df_rios_canales_info = df_rios_canales_id[['id_station','Estación de Aforo', 'Cód. ROEA', 'Tipo',
                                           'Cód. Munic.', 'Municipio', 'Cód. SE', 'Sistema de Explotación','Altitud']]
df_rios_canales_id = df_rios_canales_id[['id_station','latitude','longitude']]

# Create one location_id per unique (latitude, longitude) pair
unique_pixels = df_rios_canales_id[['latitude', 'longitude']].drop_duplicates().reset_index(drop=True)
unique_pixels['location_id'] = range(len(unique_pixels))
df_rios_canales_id = pd.merge(df_rios_canales_id, unique_pixels, on=['latitude', 'longitude'], how='left')
df_rios_canales = pd.merge(df_rios_canales, df_rios_canales_id[['id_station', 'location_id']], on = 'id_station', how = 'left')
df_rios_canales_info = pd.merge(df_rios_canales_info, df_rios_canales_id[['id_station', 'location_id']], on = 'id_station', how = 'left')
df_rios_canales_id = df_rios_canales_id.drop('id_station', axis = 1)
df_rios_canales = df_rios_canales.drop('id_station', axis = 1)
df_rios_canales['quantity_hm3'] = df_rios_canales['quantity_hm3'].str.replace(',','.').astype('float')
# Replace the date with its date_id key
df_rios_canales = pd.merge(df_rios_canales, df_date, on = 'date', how = 'left').drop('date', axis = 1)
df_rios_canales_id.to_csv('df_rios_canales_id.csv',index = False)
df_rios_canales.to_csv('df_rios_canales.csv',index = False)
df_rios_canales_info.to_csv('df_rios_canales_info.csv',index = False)
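
The cell above converts the decimal commas in `quantity_hm3` with `str.replace(',', '.')`. As an alternative sketch, `pd.read_csv` can parse them directly through its `decimal` parameter. The two-row extract and the station code `8001` below are made up for illustration; only the column names come from the notebook:

```python
import io

import pandas as pd

# Hypothetical extract mimicking the layout of the daily series file.
raw = (
    "Cód. CHJ;Fecha;Cantidad (hm³)\n"
    "8001;01-10-2020 00:00:00;0,125\n"
    "8001;02-10-2020 00:00:00;1,5\n"
)

# decimal=',' tells the parser to read "0,125" as the float 0.125.
df = pd.read_csv(io.StringIO(raw), sep=';', decimal=',')
df['Fecha'] = pd.to_datetime(df['Fecha'], format='%d-%m-%Y %H:%M:%S')
```

This only applies when every numeric column in the file uses the same decimal separator.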

## CEDEX data

https://www.cedex.es/comunicacion/noticias/cedex-tiene-disposicion-publica-datos-anuario-aforos-desde-1912

#### Reservoirs

* ref_ceh: Reservoir identifier
* fecha: Date (day/month/year) on which the value was recorded
* reserva: Daily reserve (hm3)
* salida: Mean daily outflow (m3/s)
* tipo: Measurement type identifier (1 or 2). Type 1: the reserve is measured at the end of the day, so the next day's reserve is obtained by adding the next day's inflows and subtracting the next day's outflows: R(DAY 2) = R(DAY 1) + E(DAY 2) - S(DAY 2). Type 2: the reserve is measured at the start of the day, so the next day's reserve is obtained by adding the same day's inflows and subtracting the same day's outflows: R(DAY 2) = R(DAY 1) + E(DAY 1) - S(DAY 1).
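
The balance described for `tipo` can be sketched as follows. This is an illustration, not CEDEX code: it assumes that inflows are already expressed in hm3 per day, that Type 2 uses the same day's outflows (by symmetry with Type 1), and the function names are hypothetical. The outflow conversion uses 1 m3/s over one day = 86,400 m3 = 0.0864 hm3.

```python
SECONDS_PER_DAY = 86_400
M3_PER_HM3 = 1_000_000


def m3s_to_hm3_per_day(flow_m3s):
    """Convert a mean daily flow (m3/s) into a daily volume (hm3)."""
    return flow_m3s * SECONDS_PER_DAY / M3_PER_HM3


def next_day_reserve(reserve_hm3, inflow_hm3, outflow_m3s, day, tipo):
    """Reserve on day + 1, given the reserve on `day`.

    inflow_hm3 and outflow_m3s are sequences indexed by day.
    """
    if tipo == 1:
        k = day + 1  # end-of-day measurement: use the next day's flows
    elif tipo == 2:
        k = day      # start-of-day measurement: use the same day's flows
    else:
        raise ValueError("tipo must be 1 or 2")
    return reserve_hm3 + inflow_hm3[k] - m3s_to_hm3_per_day(outflow_m3s[k])


# Made-up values for two consecutive days.
inflow_hm3 = [2.0, 3.0]
outflow_m3s = [10.0, 20.0]
r_type1 = next_day_reserve(100.0, inflow_hm3, outflow_m3s, day=0, tipo=1)
r_type2 = next_day_reserve(100.0, inflow_hm3, outflow_m3s, day=0, tipo=2)
```

The two types differ only in which day's flows enter the balance, so mixing them up shifts the series by one day.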

In [126]:
df_embalses = pd.read_csv('Extracción/CEDEX - CH Jucar/afliqe.csv',sep = ';')
df_embalses = df_embalses.rename(columns = {'reserva' : 'quantity_hm3','ref_ceh':'id_station','fecha' : 'date'})
df_embalses['date'] = pd.to_datetime(df_embalses['date'], format='%d/%m/%Y')
df_embalses = df_embalses[['id_station','date','quantity_hm3']]
df_embalses_id = pd.read_csv('Extracción/CEDEX - CH Jucar/Transformados/df_embalses_cedex_id.csv', index_col = 0)
df_embalses_id = df_embalses_id.rename(columns = {'ref_ceh':'id_station'})
df_embalses_id = df_embalses_id.drop('nom_embalse', axis = 1)
# Create one location_id per unique (latitude, longitude) pair, continuing after the river/canal ids
unique_pixels = df_embalses_id[['latitude', 'longitude']].drop_duplicates().reset_index(drop=True)
unique_pixels['location_id'] = range(df_rios_canales_id['location_id'].max()+1, df_rios_canales_id['location_id'].max() + len(unique_pixels)+1)

df_embalses_id = pd.merge(df_embalses_id, unique_pixels, on=['latitude', 'longitude'], how='left')
df_embalses_info = pd.read_excel('Extracción/Cuenca hidrográfica Juca SI/F2797_Embalses_ROEA/F2797_M0_Embalses_ROEA.xlsx')
df_embalses_info = df_embalses_info.rename(columns = {'Cód. Embalse' : 'id_station'})
df_embalses_info = pd.merge(df_embalses_info, df_embalses_id[['id_station', 'location_id']], on = 'id_station')
df_embalses = pd.merge(df_embalses, df_embalses_id[['id_station', 'location_id']], on = 'id_station')
df_embalses_id = df_embalses_id.drop('id_station', axis = 1)
df_embalses = df_embalses.drop('id_station', axis = 1)
# Replace the date with its date_id key
df_embalses = pd.merge(df_embalses, df_date, on = 'date', how = 'left').drop('date', axis = 1)
# Save to CSV
df_embalses_id.to_csv('df_embalses_id.csv',index = False)
df_embalses.to_csv('df_embalses.csv',index = False)
df_embalses_info.to_csv('df_embalses_info.csv',index = False)

## Copernicus

In [10]:
import os
import numpy as np  # used for the np.nan fallback in agg_funcs
import xarray as xr

directorio = 'Extracción/Copernicus - climate ERA5'
df_copernicus = pd.DataFrame()
agg_funcs = {
            'total_precipitation': 'sum',
            'skin_temperature': 'mean',
            'evaporation': 'sum',
            'runoff': 'sum',
            'snowfall': 'sum',
            'soil_water_l1': 'sum',
            'soil_water_l2': 'sum',
            'soil_water_l3': 'sum',
            'soil_water_l4': 'sum',
            'high_vegetation_cover': 'mean',
            'low_vegetation_cover': 'mean',
            # Vegetation types are categorical codes, so take the daily mode instead of the mean
            'type_high_vegetation': lambda x: x.mode()[0] if not x.mode().empty else np.nan,
            'type_low_vegetation': lambda x: x.mode()[0] if not x.mode().empty else np.nan
        }
for archivo in os.listdir(directorio):
    if archivo.endswith('.nc'):
        print(archivo)
        archivo_nc = os.path.join(directorio, archivo)
        ds = xr.open_dataset(archivo_nc)
        df = ds.to_dataframe()
        df = df.reset_index()
        df = df.drop(['number','expver'], axis = 1, errors='ignore')
        df = df[['valid_time', 'latitude', 'longitude', 'tp', 'skt', 'e', 'ro', 'sf',
       'swvl1', 'swvl2', 'swvl3', 'swvl4', 'cvh', 'cvl', 'tvh', 'tvl']]
        df = df.rename(columns = {'valid_time': 'date',
             'latitude': 'latitude',
             'longitude': 'longitude',
             'tp': 'total_precipitation',
             'skt': 'skin_temperature',
             'e': 'evaporation',
             'ro': 'runoff',
             'sf': 'snowfall',
             'swvl1': 'soil_water_l1',
             'swvl2': 'soil_water_l2',
             'swvl3': 'soil_water_l3',
             'swvl4': 'soil_water_l4',
             'cvh': 'high_vegetation_cover',
             'cvl': 'low_vegetation_cover',
             'tvh': 'type_high_vegetation',
             'tvl': 'type_low_vegetation'})
        df['date'] = pd.to_datetime(df['date']).dt.date
        
        df = df.groupby(['latitude', 'longitude', 'date']).agg(agg_funcs).reset_index()
        df_copernicus = pd.concat([df_copernicus, df], ignore_index=True)

1960-01-01_a_1960-01-31.nc
1960-02-01_a_1960-02-29.nc
1960-03-01_a_1960-03-31.nc
1960-04-01_a_1960-04-30.nc
1960-05-01_a_1960-05-31.nc
1960-06-01_a_1960-06-30.nc
1960-07-01_a_1960-07-31.nc
1960-08-01_a_1960-08-31.nc
1960-09-01_a_1960-09-30.nc
1960-10-01_a_1960-10-31.nc
1960-11-01_a_1960-11-30.nc
1960-12-01_a_1960-12-31.nc
1961-01-01_a_1961-01-31.nc
1961-02-01_a_1961-02-28.nc
1961-03-01_a_1961-03-31.nc
1961-04-01_a_1961-04-30.nc
1961-05-01_a_1961-05-31.nc
1961-06-01_a_1961-06-30.nc
1961-07-01_a_1961-07-31.nc
1961-08-01_a_1961-08-31.nc
1961-09-01_a_1961-09-30.nc
1961-10-01_a_1961-10-31.nc
1961-11-01_a_1961-11-30.nc
1961-12-01_a_1961-12-31.nc
1962-01-01_a_1962-01-31.nc
1962-02-01_a_1962-02-28.nc
1962-03-01_a_1962-03-31.nc
1962-04-01_a_1962-04-30.nc
1962-05-01_a_1962-05-31.nc
1962-06-01_a_1962-06-30.nc
1962-07-01_a_1962-07-31.nc
1962-08-01_a_1962-08-31.nc
1962-09-01_a_1962-09-30.nc
1962-10-01_a_1962-10-31.nc
1962-11-01_a_1962-11-30.nc
1962-12-01_a_1962-12-31.nc
1963-01-01_a_1963-01-31.nc
1

KeyError: "['skt', 'swvl1', 'swvl2', 'swvl3', 'swvl4', 'cvh', 'cvl', 'tvh', 'tvl'] not in index"
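
The `KeyError` above indicates that at least one file does not contain every variable selected in the loop. One possible guard, sketched here under assumptions, is to align each frame to the expected columns with `DataFrame.reindex`, which adds absent columns filled with NaN, and to record which ones were missing. The helper `select_expected` and the toy frame are hypothetical; only the short variable names come from the notebook.

```python
import pandas as pd

# Short names the notebook selects from each ERA5 file.
EXPECTED_COLUMNS = ['valid_time', 'latitude', 'longitude', 'tp', 'skt', 'e', 'ro', 'sf',
                    'swvl1', 'swvl2', 'swvl3', 'swvl4', 'cvh', 'cvl', 'tvh', 'tvl']


def select_expected(df, expected=EXPECTED_COLUMNS):
    """Return df restricted to `expected`, adding NaN columns for absent variables."""
    missing = [col for col in expected if col not in df.columns]
    return df.reindex(columns=expected), missing


# Toy frame standing in for a file that only carries some of the variables.
partial = pd.DataFrame({
    'valid_time': pd.to_datetime(['1963-01-01 00:00', '1963-01-01 01:00']),
    'latitude': [39.0, 39.0],
    'longitude': [-1.0, -1.0],
    'tp': [0.0, 0.001],
    'e': [-0.0001, -0.0002],
    'ro': [0.0, 0.0],
    'sf': [0.0, 0.0],
})

aligned, missing = select_expected(partial)
print(missing)
```

Note that pandas' `sum` returns 0 for an all-NaN group by default, so the summed variables would need separate handling (for example `min_count=1`) to keep missing data distinguishable from a true zero.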

In [128]:
#df_copernicus = pd.read_csv('df_copernicus.csv').drop('location_id', axis = 1)

df_copernicus['date'] = pd.to_datetime(df_copernicus['date'])

df_copernicus_id = df_copernicus[['latitude', 'longitude']].drop_duplicates().reset_index(drop=True)

# Assign a unique ID to each (latitude, longitude) combination, continuing after the reservoir ids
df_copernicus_id['location_id'] = range(df_embalses_id['location_id'].max()+1, df_embalses_id['location_id'].max() + len(df_copernicus_id)+1)

df_copernicus = df_copernicus.merge(df_copernicus_id, on=['latitude', 'longitude'], how='left').drop(['latitude', 'longitude'],axis = 1)
df_copernicus = pd.merge(df_copernicus, df_date, on = 'date', how = 'left').drop('date', axis = 1)
df_copernicus.to_csv('df_copernicus.csv', index = False)
df_copernicus_id.to_csv('df_copernicus_id.csv', index = False)

## Aemet

In [147]:
df_aemet1 = pd.read_csv('Extracción/Aemet/2004-02-29_2024-09-30.csv')
df_aemet2 = pd.read_csv('Extracción/Aemet/1990-01-01_2004-02-19.csv')
df_aemet3 = pd.read_csv('Extracción/Aemet/1980-01-01_1989-12-01.csv')
df_aemet_info =  pd.read_csv('Extracción/Aemet/estaciones_aemet_id.csv')

df_aemet = pd.concat([df_aemet1, df_aemet2, df_aemet3])
df_aemet = df_aemet.drop([ 'horatmin', 'horatmax','hrMedia','hrMax', 'horaHrMax', 'horaHrMin', 'horaPresMax','horaPresMin'], axis = 1)
decimal_columns = ['tmed', 'prec', 'tmin', 'tmax',  'velmedia', 'racha',  'presMax', 'presMin', 'sol']
df_aemet['prec'] = df_aemet['prec'].replace('Ip', '0') 
df_aemet['prec'] = df_aemet['prec'].replace('Acum', '9999999999')

for col in decimal_columns:
    df_aemet[col] = df_aemet[col].str.replace(',', '.').astype(float)

df_aemet_id = df_aemet_info[['latitude', 'longitude']].drop_duplicates().reset_index(drop=True)
df_aemet_id['location_id'] = range(df_copernicus['location_id'].max()+1, df_copernicus['location_id'].max() + len(df_aemet_id)+1)
df_aemet_info = df_aemet_info.merge(df_aemet_id, on=['latitude', 'longitude'], how='left').drop(['latitude', 'longitude'],axis = 1)
df_aemet = df_aemet.merge(df_aemet_info[['indicativo', 'location_id']], on=['indicativo'], how='left').drop(['indicativo'],axis = 1)

max_value = df_aemet['prec'].max()
second_max_value = df_aemet['prec'][df_aemet['prec'] != max_value].max()
df_aemet['prec'] = df_aemet['prec'].replace(max_value, second_max_value)
df_aemet = df_aemet.rename(columns = {'fecha' : 'date'})
df_aemet['date'] = pd.to_datetime(df_aemet['date'])
# Merge with the date table to replace the date with its date_id key
df_aemet = pd.merge(df_aemet,df_date, on = 'date', how = 'left').drop('date', axis = 1)
df_aemet.to_csv('df_aemet.csv',index = False)
df_aemet_id.to_csv('df_aemet_id.csv',index = False)
df_aemet_info.to_csv('df_aemet_info.csv',index = False)
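
The cell above maps `'Acum'` to a large sentinel and later replaces it with the second-largest precipitation value. A different option, shown here only as a sketch and not as the notebook's method, is to treat `'Acum'` (and any other non-numeric marker) as missing by using `pd.to_numeric(..., errors='coerce')`. The sample values are made up.

```python
import pandas as pd

# Made-up precipitation strings in the same style as the 'prec' column.
prec = pd.Series(['0,0', 'Ip', '12,4', 'Acum'])

# 'Ip' is mapped to zero, as in the notebook.
cleaned = prec.replace('Ip', '0').str.replace(',', '.', regex=False)

# Anything that still is not a number (here 'Acum') becomes NaN.
values = pd.to_numeric(cleaned, errors='coerce')
```

Leaving those days as NaN keeps them out of sums and means, at the cost of having to handle missing values downstream.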

### locations_id table

In [150]:
df_rios_canales_id['Type'] = 'Rio'
df_embalses_id['Type'] = 'Embalse'
df_copernicus_id['Type'] = 'Copernicus'
df_aemet_id['Type'] = 'Aemet'

In [153]:
locations_id = pd.concat([df_rios_canales_id,df_embalses_id,df_copernicus_id,df_aemet_id])
locations_id.to_csv('locations_id.csv',index = False)

### Pixels table

In [169]:
from scipy.spatial.distance import cdist
import numpy as np

coords_copernicus = df_copernicus_id[['latitude', 'longitude']].to_numpy()
coords_rios_canales = df_rios_canales_id[['latitude', 'longitude']].to_numpy()
coords_embalses = df_embalses_id[['latitude', 'longitude']].to_numpy()
coords_aemet = df_aemet_id[['latitude', 'longitude']].to_numpy()

dist_coper_rios = cdist(coords_rios_canales, coords_copernicus, metric='euclidean')
dist_coper_embalses = cdist(coords_embalses, coords_copernicus, metric='euclidean')
dist_coper_aemet = cdist(coords_aemet, coords_copernicus, metric='euclidean')

closest__coper_rios = np.argmin(dist_coper_rios, axis=1)
closest_coper_embalses = np.argmin(dist_coper_embalses, axis=1)
closest_coper_aemet = np.argmin(dist_coper_aemet, axis=1)

# closest_* hold row positions in df_copernicus_id; map them to the pixel's location_id
df_closest_rios = pd.DataFrame({
    'location_id_rios_canales': df_rios_canales_id['location_id'].to_numpy(),  # location_id of each river/canal station
    'location_id_copernicus': df_copernicus_id['location_id'].to_numpy()[closest__coper_rios],  # location_id of the nearest Copernicus pixel
    'latitude': df_copernicus_id['latitude'].to_numpy()[closest__coper_rios],
    'longitude': df_copernicus_id['longitude'].to_numpy()[closest__coper_rios]
})

df_closest_embalses = pd.DataFrame({
    'location_id_embalses': df_embalses_id['location_id'].to_numpy(),  # location_id of each reservoir
    'location_id_copernicus': df_copernicus_id['location_id'].to_numpy()[closest_coper_embalses],  # location_id of the nearest Copernicus pixel
    'latitude': df_copernicus_id['latitude'].to_numpy()[closest_coper_embalses],
    'longitude': df_copernicus_id['longitude'].to_numpy()[closest_coper_embalses]
})

df_closest_aemet = pd.DataFrame({
    'location_id_aemet': df_aemet_id['location_id'].to_numpy(),  # location_id of each AEMET station
    'location_id_copernicus': df_copernicus_id['location_id'].to_numpy()[closest_coper_aemet],  # location_id of the nearest Copernicus pixel
    'latitude': df_copernicus_id['latitude'].to_numpy()[closest_coper_aemet],
    'longitude': df_copernicus_id['longitude'].to_numpy()[closest_coper_aemet]
})
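
The nearest-pixel lookup can be checked on a toy example. The sketch below uses NumPy broadcasting to compute the same Euclidean distances that `cdist(..., metric='euclidean')` would return, then maps each `argmin` position to the pixel's `location_id`. All coordinates and ids are invented, and the frames `pixels` and `stations` are hypothetical stand-ins for `df_copernicus_id` and a station table.

```python
import numpy as np
import pandas as pd

# Invented grid pixels and stations.
pixels = pd.DataFrame({
    'latitude': [39.0, 39.0, 39.25],
    'longitude': [-1.0, -0.75, -1.0],
    'location_id': [500, 501, 502],
})
stations = pd.DataFrame({
    'latitude': [39.02, 39.2],
    'longitude': [-0.8, -0.98],
    'location_id': [0, 1],
})

a = stations[['latitude', 'longitude']].to_numpy()
b = pixels[['latitude', 'longitude']].to_numpy()

# Pairwise Euclidean distances in degrees, shape (n_stations, n_pixels).
dist = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)

# argmin gives a row position in `pixels`, not a location_id.
nearest_pos = np.argmin(dist, axis=1)

mapping = pd.DataFrame({
    'location_id_station': stations['location_id'].to_numpy(),
    'location_id_copernicus': pixels['location_id'].to_numpy()[nearest_pos],
})
```

Distances in degrees treat latitude and longitude as equal units, which is an approximation; over a region the size of a river basin it is usually enough to pick the nearest grid cell, but a great-circle distance would be more exact.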

In [181]:
df_pixeles_cercanos = pd.merge(df_closest_rios, df_closest_embalses, on = ['location_id_copernicus','latitude', 'longitude'], how = 'outer')
df_pixeles_cercanos = pd.merge(df_pixeles_cercanos, df_closest_aemet, on = ['location_id_copernicus','latitude', 'longitude'], how = 'outer')
df_pixeles_cercanos = df_pixeles_cercanos[[ 'location_id_copernicus', 'location_id_embalses', 'location_id_aemet','location_id_rios_canales']]

In [183]:
df_pixeles_cercanos.to_csv('df_pixeles_cercanos.csv',index = False) 

In [189]:
df_aemet_info.dtypes

provincia       object
altitud          int64
indicativo      object
nombre          object
indsinop       float64
location_id      int64
dtype: object

# Database

## SQL

With all the tables built, we now set up a relational database with the following schema:

![title](Schema.png)

In [220]:
df_pixeles_cercanos = pd.read_csv('df_pixeles_cercanos.csv')
df_copernicus_id = pd.read_csv('df_copernicus_id.csv')
df_copernicus = pd.read_csv('df_copernicus.csv')
df_embalses = pd.read_csv('df_embalses.csv')
df_embalses_id = pd.read_csv('df_embalses_id.csv')
df_embalses_info = pd.read_csv('df_embalses_info.csv')
df_rios_canales = pd.read_csv('df_rios_canales.csv')
df_rios_canales_id = pd.read_csv('df_rios_canales_id.csv')
df_rios_canales_info = pd.read_csv('df_rios_canales_info.csv')
df_aemet = pd.read_csv('df_aemet.csv')
df_aemet_id = pd.read_csv('df_aemet_id.csv')
df_aemet_info = pd.read_csv('df_aemet_info.csv')
df_date = pd.read_csv('df_date.csv')
locations_id = pd.read_csv('locations_id.csv')

In [225]:
import sqlite3

# Connect to the SQLite database (it is created if it does not exist)
conn = sqlite3.connect('cantidadAguaJucar.db')
cursor = conn.cursor()
cursor.execute("PRAGMA foreign_keys = ON;")

sql_script = '''
CREATE TABLE `df_pixeles_cercanos` (
  `location_id_copernicus` int64,
  `location_id_embalses` float64,
  `location_id_aemet` float64,
  `location_id_rios_canales` float64
);

CREATE TABLE `df_copernicus_id` (
  `latitude` float64,
  `longitude` float64,
  `location_id` int64
);

CREATE TABLE `df_copernicus` (
  `total_precipitation` float64,
  `skin_temperature` float64,
  `evaporation` float64,
  `runoff` float64,
  `snowfall` float64,
  `soil_water_l1` float64,
  `soil_water_l2` float64,
  `soil_water_l3` float64,
  `soil_water_l4` float64,
  `high_vegetation_cover` float64,
  `low_vegetation_cover` float64,
  `type_high_vegetation` float64,
  `type_low_vegetation` float64,
  `location_id` int64,
  `date_id` int64
);

CREATE TABLE `df_embalses` (
  `quantity_hm3` float64,
  `location_id` int64,
  `date_id` int64
);

CREATE TABLE `df_embalses_id` (
  `longitude` float64,
  `latitude` float64,
  `location_id` int64
);

CREATE TABLE `df_embalses_info` (
  `id_station` int64,
  `Embalse` varchar(255),
  `CodROEA` int64,
  `CodPresaprincipal` varchar(255),
  `PresaPrincipal` varchar(255),
  `VolUtil_hm3` float64,
  `CodMunic` int64,
  `Municipio` varchar(255),
  `CodProv` int64,
  `Provincia` varchar(255),
  `CodSE` int64,
  `SistemadeExplotación` varchar(255),
  `Cauce` varchar(255),
  `CodMasaSuperfPHJ22` varchar(255),
  `MasaSuperficialPHJ22` varchar(255),
  `location_id` int64
);

CREATE TABLE `df_rios_canales` (
  `quantity_hm3` float64,
  `location_id` int64,
  `date_id` int64
);

CREATE TABLE `df_rios_canales_id` (
  `latitude` float64,
  `longitude` float64,
  `location_id` int64
);

CREATE TABLE `df_rios_canales_info` (
  `id_station` int64,
  `EstacióndeAforo` varchar(255),
  `CodROEA` int64,
  `Tipo` varchar(255),
  `CodMunic` int64,
  `Municipio` varchar(255),
  `CodSE` float64,
  `SistemadeExplotación` varchar(255),
  `Altitud` varchar(255),
  `location_id` int64
);

CREATE TABLE `df_aemet` (
  `altitud` int64,
  `tmed` float64,
  `prec` float64,
  `tmin` float64,
  `tmax` float64,
  `dir` float64,
  `velmedia` float64,
  `racha` float64,
  `horaracha` varchar(255),
  `hrMin` float64,
  `presMax` float64,
  `presMin` float64,
  `sol` float64,
  `location_id` int64,
  `date_id` int64
);

CREATE TABLE `df_aemet_id` (
  `latitude` float64,
  `longitude` float64,
  `location_id` int64
);

CREATE TABLE `df_aemet_info` (
  `provincia` object,
  `altitud` int64,
  `indicativo` object,
  `nombre` object,
  `indsinop` float64,
  `location_id` int64
);

CREATE TABLE `df_date` (
  `date` timestamp,
  `date_id` int64
);

CREATE TABLE `locations_id` (
  `latitude` float64,
  `longitude` float64,
  `location_id` int64,
  `Type` object
);

ALTER TABLE `df_date` ADD FOREIGN KEY (`date_id`) REFERENCES `df_copernicus` (`date_id`);

ALTER TABLE `df_date` ADD FOREIGN KEY (`date_id`) REFERENCES `df_aemet` (`date_id`);

ALTER TABLE `df_date` ADD FOREIGN KEY (`date_id`) REFERENCES `df_rios_canales` (`date_id`);

ALTER TABLE `df_date` ADD FOREIGN KEY (`date_id`) REFERENCES `df_embalses` (`date_id`);

ALTER TABLE `locations_id` ADD FOREIGN KEY (`location_id`) REFERENCES `df_copernicus_id` (`location_id`);

ALTER TABLE `locations_id` ADD FOREIGN KEY (`location_id`) REFERENCES `df_rios_canales_id` (`location_id`);

ALTER TABLE `locations_id` ADD FOREIGN KEY (`location_id`) REFERENCES `df_embalses_id` (`location_id`);

ALTER TABLE `locations_id` ADD FOREIGN KEY (`location_id`) REFERENCES `df_aemet_id` (`location_id`);

ALTER TABLE `df_copernicus` ADD FOREIGN KEY (`location_id`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_rios_canales` ADD FOREIGN KEY (`location_id`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_embalses` ADD FOREIGN KEY (`location_id`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_aemet` ADD FOREIGN KEY (`location_id`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_pixeles_cercanos` ADD FOREIGN KEY (`location_id_copernicus`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_pixeles_cercanos` ADD FOREIGN KEY (`location_id_rios_canales`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_pixeles_cercanos` ADD FOREIGN KEY (`location_id_embalses`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_pixeles_cercanos` ADD FOREIGN KEY (`location_id_aemet`) REFERENCES `locations_id` (`location_id`);

ALTER TABLE `df_rios_canales_info` ADD FOREIGN KEY (`location_id`) REFERENCES `df_rios_canales` (`location_id`);

ALTER TABLE `df_embalses_info` ADD FOREIGN KEY (`location_id`) REFERENCES `df_embalses` (`location_id`);

ALTER TABLE `df_aemet_info` ADD FOREIGN KEY (`location_id`) REFERENCES `df_aemet` (`location_id`);'''

# Run the script
cursor.executescript(sql_script)

# Commit the changes
conn.commit()

# Close the connection
conn.close()

OperationalError: table `df_pixeles_cercanos` already exists
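
The error above appears when the script is run against a database file where the tables were already created. Separately, SQLite's `ALTER TABLE` cannot add foreign keys, so in SQLite they have to be declared inside `CREATE TABLE`. The sketch below shows both points on an in-memory database with only three of the tables. The key direction (fact table referencing `df_date` and `locations_id`), the primary keys, and the column types are assumptions for illustration, not the schema from `Schema.png`.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory database, for illustration only
conn.execute('PRAGMA foreign_keys = ON;')

schema = '''
CREATE TABLE IF NOT EXISTS df_date (
  date TEXT,
  date_id INTEGER PRIMARY KEY
);
CREATE TABLE IF NOT EXISTS locations_id (
  latitude REAL,
  longitude REAL,
  location_id INTEGER PRIMARY KEY,
  Type TEXT
);
CREATE TABLE IF NOT EXISTS df_embalses (
  quantity_hm3 REAL,
  location_id INTEGER REFERENCES locations_id (location_id),
  date_id INTEGER REFERENCES df_date (date_id)
);
'''
conn.executescript(schema)
conn.executescript(schema)  # running it again does nothing thanks to IF NOT EXISTS

# Parent rows must exist before the rows that reference them.
conn.execute("INSERT INTO df_date VALUES ('2000-01-01', 20000101)")
conn.execute("INSERT INTO locations_id VALUES (39.0, -1.0, 1, 'Embalse')")
conn.execute("INSERT INTO df_embalses VALUES (12.5, 1, 20000101)")

# A row pointing at an unknown location_id is rejected.
try:
    conn.execute("INSERT INTO df_embalses VALUES (3.0, 999, 20000101)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

n_rows = conn.execute('SELECT COUNT(*) FROM df_embalses').fetchone()[0]
conn.close()
```

With foreign keys enforced, the load order matters: `df_date` and `locations_id` have to be filled before the measurement tables.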

In [12]:
conn = sqlite3.connect('cantidadAguaJucar.db')

# Load the DataFrames into their corresponding tables
df_dates.to_sql('dates', conn, if_exists='append', index=False)
df_copernicus.to_sql('df_copernicus', conn, if_exists='append', index=False)
df_embalses_cedex.to_sql('df_embalses_cedex', conn, if_exists='append', index=False)
df_embalses_cedex_id.to_sql('df_embalses_cedex_id', conn, if_exists='append', index=False)
df_rios_canales.to_sql('df_rios_canales', conn, if_exists='append', index=False)
df_rios_canales_id.to_sql('df_rios_canales_id', conn, if_exists='append', index=False)
df_albufera.to_sql('df_albufera', conn, if_exists='append', index=False)
df_albufera_id.to_sql('df_albufera_id', conn, if_exists='append', index=False)
df_pixeles_cercanos.to_sql('pixeles_cercanos', conn, if_exists='append', index=False)
pixels_copernicus.to_sql('pixels_copernicus', conn, if_exists='append', index=False)
conn.close()


In [20]:
conn = sqlite3.connect('cantidadAguaJucar.db')

cursor = conn.cursor()

query = "SELECT df_copernicus.total_precipitation, dates.date FROM df_copernicus JOIN dates ON df_copernicus.date_id = dates.date_id LIMIT 100;"

df = pd.read_sql_query(query, conn)


conn.close()
df

Unnamed: 0,total_precipitation,date
0,0.000000e+00,2000-01-01T00:00:00
1,0.000000e+00,2000-01-02T00:00:00
2,1.452863e-07,2000-01-03T00:00:00
3,0.000000e+00,2000-01-04T00:00:00
4,0.000000e+00,2000-01-05T00:00:00
...,...,...
95,3.184192e-06,2000-01-03T00:00:00
96,0.000000e+00,2000-01-04T00:00:00
97,0.000000e+00,2000-01-05T00:00:00
98,0.000000e+00,2000-01-06T00:00:00
