# Generate Lorenz data

This notebook accompanies the following publication:
Paul Platzer, Arthur Avenas, Bertrand Chapron, Lucas Drumetz, Alexis Mouche, Léo Vinour. Distance Learning for Analog Methods. 2024. [⟨hal-04841334⟩](https://hal.science/hal-04841334)

It is used to generate two trajectories of the Lorenz system that will be used in numerical experiments.

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from tqdm.notebook import tqdm
import sys
sys.path.append('../../functions/.')
from generate_lorenz import RK4, l63, integrate_l63
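
The `generate_lorenz` module is not reproduced in this notebook. As a rough idea of what the imported helpers might do, here is a minimal sketch of a fourth-order Runge-Kutta integration of the Lorenz-63 system. The function names (`lorenz63_rhs`, `rk4_step`, `integrate_sketch`), the classical parameter values (sigma = 10, rho = 28, beta = 8/3) and the default initial condition are assumptions made for illustration; the 1000-step spin-up follows the comment in the catalog loop further down. The actual signatures of `RK4`, `l63` and `integrate_l63` may differ.

```python
import numpy as np

def lorenz63_rhs(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the Lorenz-63 state (classical parameters assumed)."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

def rk4_step(f, x, dt):
    """Advance the state x by one time-step dt with the classical RK4 scheme."""
    k1 = f(x)
    k2 = f(x + 0.5 * dt * k1)
    k3 = f(x + 0.5 * dt * k2)
    k4 = f(x + dt * k3)
    return x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def integrate_sketch(x0, dt, n_points, n_spinup=1000):
    """Return an (n_points, 3) trajectory, discarding n_spinup initial steps."""
    x = np.asarray(x0, dtype=float)
    # Spin-up: let the state reach the attractor before recording anything
    for _ in range(n_spinup):
        x = rk4_step(lorenz63_rhs, x, dt)
    traj = np.empty((n_points, 3))
    traj[0] = x
    for i in range(1, n_points):
        traj[i] = rk4_step(lorenz63_rhs, traj[i - 1], dt)
    return traj

# Short demonstration run (hypothetical size, far smaller than the catalogs below)
traj_demo = integrate_sketch([1.0, 1.0, 1.0], dt=0.01, n_points=2000)
```

A pure-Python loop like this one is slow for millions of steps, which is consistent with the long run times quoted later in the notebook.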

In [2]:
data_folder = '../../data/lorenz/'

# Generate catalog for sections 3.a and 3.c
In these sections, the algorithm is tested on different variables and at different forecasting horizons, and the MSE-based and CRPS-based optimizations are compared.

In [3]:
# Set parameters
Ntrain = 10**5
tau = 0.64 # time between two elements of the catalog (2 "days")
dt = 0.01 # integration time-step
h_max = 4.48 # maximal forecast horizon to be tested (2 "weeks")
Ntraj = Ntrain*int(tau/dt) + int(h_max/dt)

# Integrate
traj = integrate_l63( dt = dt , N = Ntraj )

In [5]:
# Normalize
stds = np.std(traj, axis=0)
traj_norm = traj/stds

In [7]:
# Save output (146.5 MB)
np.savez(data_folder + 'catalog_small.npz',
         traj_norm = traj_norm, stds = stds,
         Ntrain = Ntrain, tau = tau, dt = dt, h_max = h_max)
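
To illustrate how the saved parameters relate to the trajectory, the sketch below writes a tiny stand-in file with the same keys, reloads it, and extracts catalog elements spaced by `tau` together with their successors at a given horizon. The file name `catalog_demo.npz`, the demo sizes and the indexing scheme are assumptions for illustration; the experiments in the publication may slice the trajectory differently.

```python
import os
import tempfile
import numpy as np

# Tiny synthetic stand-in for the real catalog (hypothetical sizes, same keys)
Ntrain_demo, tau, dt, h_max = 5, 0.64, 0.01, 4.48
stride = int(round(tau / dt))    # integration steps between two catalog elements
n_hmax = int(round(h_max / dt))  # integration steps covering the longest horizon
traj_demo = np.arange((Ntrain_demo * stride + n_hmax) * 3, dtype=float).reshape(-1, 3)
stds_demo = np.std(traj_demo, axis=0)

path = os.path.join(tempfile.mkdtemp(), 'catalog_demo.npz')
np.savez(path, traj_norm=traj_demo / stds_demo, stds=stds_demo,
         Ntrain=Ntrain_demo, tau=tau, dt=dt, h_max=h_max)

# Reload: scalars come back as 0-d arrays, so convert them explicitly
with np.load(path) as cat:
    traj_norm = cat['traj_norm']
    stds = cat['stds']
    Ntrain = int(cat['Ntrain'])
    stride = int(round(float(cat['tau']) / float(cat['dt'])))
    step = float(cat['dt'])

h = 0.64                                # forecast horizon, must not exceed h_max
lag = int(round(h / step))              # horizon expressed in integration steps
idx = np.arange(Ntrain) * stride        # positions of the catalog elements
analogs = traj_norm[idx]                # candidate analogs, shape (Ntrain, 3)
successors = traj_norm[idx + lag]       # their states a time h later
successors_phys = successors * stds     # undo the normalization
```

The extra `int(h_max/dt)` points appended to the trajectory are what keep `idx + lag` in bounds for every horizon up to `h_max`.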

# Generate catalogs for section 3.b
In this section, the algorithm is tested for different catalog sizes. The integration here is longer ($\sim$20 minutes per catalog, with 10 catalogs) and generates larger files ($\sim$1.4 GB per catalog). Depending on your computational resources and requirements, this code could be modified to run in parallel.

In [8]:
# Set parameters
Ntrain = 10**6
tau = 0.64 # time between two elements of the catalog (2 "days")
dt = 0.01 # integration time-step
h_max = 4.48 # maximal forecast horizon to be tested (2 "weeks")
Ntraj = Ntrain*int(tau/dt) + int(h_max/dt)

# Loop over the 10 catalogs
for j in tqdm(range(10)):

    # Initial condition (there will be a spin-up of 1000 time-steps to ensure the beginning of the catalog is inside the attractor)
    train0 = np.array([1,1,1]) + [1,2,3,-1,-2,-3,1.5,2.5,-.25,-.5][j]*0.01*np.random.randn(3)
    
    # Integrate
    traj = integrate_l63( X0 = train0, dt = dt , N = Ntraj )
    
    # Normalize
    stds = np.std(traj, axis=0)
    traj_norm = traj/stds
    
    # Store
    np.savez(data_folder + 'catalog_large_'+str(j)+'.npz',
             traj_norm = traj_norm, stds = stds,
             Ntrain = Ntrain, tau = tau, dt = dt, h_max = h_max)
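
One possible way to run the catalogs in parallel is sketched below with `concurrent.futures.ProcessPoolExecutor`, giving each catalog its own random stream through `numpy.random.SeedSequence.spawn` so that runs are independent and reproducible. Everything named here (`integrate_stub`, `make_catalog`, `run_parallel`, the output file name, the root seed) is hypothetical; `integrate_stub` is only a stand-in without spin-up and should be replaced by `integrate_l63`. Explicit seeding is an addition, since the serial loop above draws from the unseeded global generator.

```python
import os
from concurrent.futures import ProcessPoolExecutor
import numpy as np

# Same perturbation factors as in the serial loop
AMPLITUDES = [1, 2, 3, -1, -2, -3, 1.5, 2.5, -.25, -.5]

def _l63(x):
    """Lorenz-63 right-hand side with the classical parameters (assumed)."""
    return np.array([10.0 * (x[1] - x[0]),
                     x[0] * (28.0 - x[2]) - x[1],
                     x[0] * x[1] - 8.0 / 3.0 * x[2]])

def integrate_stub(X0, dt, N):
    """Hypothetical stand-in for integrate_l63: plain RK4, no spin-up."""
    traj = np.empty((N, 3))
    traj[0] = X0
    for i in range(1, N):
        x = traj[i - 1]
        k1 = _l63(x)
        k2 = _l63(x + 0.5 * dt * k1)
        k3 = _l63(x + 0.5 * dt * k2)
        k4 = _l63(x + dt * k3)
        traj[i] = x + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return traj

def make_catalog(task):
    """Worker: integrate one catalog, normalize it, save it, return the path."""
    j, seed_seq, dt, N, out_dir = task
    rng = np.random.default_rng(seed_seq)  # independent stream per catalog
    X0 = np.array([1.0, 1.0, 1.0]) + AMPLITUDES[j] * 0.01 * rng.standard_normal(3)
    traj = integrate_stub(X0, dt, N)
    stds = np.std(traj, axis=0)
    path = os.path.join(out_dir, 'catalog_demo_' + str(j) + '.npz')
    # Saving inside the worker avoids sending large arrays back to the parent
    np.savez(path, traj_norm=traj / stds, stds=stds, dt=dt)
    return path

def run_parallel(out_dir, n_catalogs=10, dt=0.01, N=2000, max_workers=4, root_seed=0):
    """Distribute the catalogs over max_workers processes."""
    seeds = np.random.SeedSequence(root_seed).spawn(n_catalogs)
    tasks = [(j, seeds[j], dt, N, out_dir) for j in range(n_catalogs)]
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(make_catalog, tasks))
```

When launched from a script, `run_parallel` is best called under an `if __name__ == "__main__":` guard, which process pools need on platforms that start workers with `spawn`. With the real catalog size, each worker holds its full trajectory in memory, so `max_workers` may have to be limited by available RAM rather than by core count.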
