# Decoding Financial Market Oscillations: A Fourier and Wavelet Transform Analysis
## apoth3osis R&D

Welcome to this advanced analytics notebook from apoth3osis R&D. In the unpredictable realm of financial markets, identifying underlying patterns is key to developing robust algorithmic trading strategies. This notebook explores the power of Fourier and Wavelet Transforms to uncover hidden oscillatory behaviors within minute-level EUR/USD exchange rate data. By transforming price movements into their constituent frequencies, we aim to isolate significant cyclical components and understand how they contribute to overall market dynamics. This foundational research seeks to determine if sufficient signal exists within the data to justify the application of advanced machine learning techniques, including those leveraging Fourier and Wavelet analysis for predictive modeling and investment opportunities.

## 1. Environment Setup and Data Ingestion

This section prepares the Python environment by importing necessary libraries and provides a mechanism to securely upload the financial time series data. We will use `pandas` for data manipulation, `numpy` for numerical operations, `matplotlib` for visualization, `scipy.fftpack` for Fourier Transforms, `scipy.signal` for signal processing, `statsmodels` for time series analysis, and `pywt` for Wavelet Transforms. A robust file upload method is implemented to ensure data privacy and ease of use within the Colab environment.

In [None]:
# Consolidate all library imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from scipy.fftpack import fft, ifft, fftfreq
from scipy import signal
from statsmodels.tsa.stattools import acf
from statsmodels.graphics.tsaplots import plot_acf
from sklearn.preprocessing import StandardScaler
import pywt
from google.colab import files, drive
from mpl_toolkits.mplot3d import Axes3D # For Bloch Sphere visualization
import os
from multiprocessing import Pool, cpu_count
from sklearn.metrics import mean_squared_error
from datetime import timedelta

# File upload mechanism
def upload_data():
    uploaded = files.upload()
    for fn in uploaded.keys():
        print(f'User uploaded file "{fn}"')
        return fn # Assuming single file upload
    return None

file_name = None
try:
    file_name = upload_data()
    if file_name:
        df = pd.read_csv(f'/content/{file_name}')
        df['Date'] = pd.to_datetime(df['Date'])
        df.set_index('Date', inplace=True)
        print("\nData loaded successfully. Displaying head and info:")
        print(df.head())
        df.info()  # info() prints its summary directly and returns None
    else:
        print("No file uploaded. Please upload the 'complete_interpolated_eur_usd_data.csv' file.")
except FileNotFoundError:
    print(f"Error: The file {file_name} was not found. Please ensure it is correctly uploaded.")
except pd.errors.EmptyDataError:
    print(f"Error: The file {file_name} is empty.")
except pd.errors.ParserError:
    print(f"Error: Could not parse {file_name}. Please check the CSV format.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")

# Ensure the 'EUR/USD' column is available and resample if data is loaded
df_resampled = pd.DataFrame() # Initialize empty DataFrame
if 'df' in locals() and not df.empty:
    if 'EUR/USD' in df.columns:
        # Resample the data to 1-minute bins. FFT analysis assumes uniformly sampled data.
        # Note: dropna() removes empty bins (e.g. weekend gaps), so any gap in the source
        # makes the spacing non-uniform again. The input file is expected to be
        # pre-interpolated so that no bins are empty.
        df_resampled = df.resample('1Min').mean().dropna().reset_index()
        print("\nData resampled to 1-minute intervals and NaNs dropped. Displaying head:")
        print(df_resampled.head())
    else:
        print("Error: 'EUR/USD' column not found in the uploaded data.")
else:
    print("DataFrame 'df' is not loaded or is empty. Cannot proceed with resampling.")
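The uniform-sampling requirement mentioned in the comments above can be checked directly. The following is a minimal sketch on made-up timestamps (the prices and the weekend gap are hypothetical, not taken from the dataset). It illustrates that `resample(...).mean().dropna()` hides a gap rather than filling it, so it may be worth verifying the spacing before running an FFT.

```python
import pandas as pd

# Hypothetical quotes around a weekend gap (illustrative values only).
idx = pd.to_datetime([
    "2024-01-05 21:58", "2024-01-05 21:59",
    "2024-01-07 22:00", "2024-01-07 22:01",
])
prices = pd.Series([1.0940, 1.0941, 1.0950, 1.0951], index=idx, name="EUR/USD")

binned = prices.resample("1min").mean()   # one row per minute, NaN where no quotes exist
dropped = binned.dropna()                 # empty bins removed: the gap is hidden, not filled

# Spacing between consecutive surviving rows
steps = dropped.index.to_series().diff().dropna()
is_uniform = bool((steps == pd.Timedelta(minutes=1)).all())
print(len(binned), len(dropped), is_uniform)
```

If `is_uniform` comes out false on real data, options include interpolating the empty bins or analysing each continuous session separately; which one is appropriate depends on the data.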


## 2. Charting the Original Signal

Visualizing the raw EUR/USD exchange rate data is the first step in understanding its behavior. This chart provides a macroscopic view of the signal over time, highlighting major trends, volatility shifts, and any apparent long-term patterns. It serves as a baseline for comparing subsequent transformations and analyses.

In [None]:
if not df_resampled.empty:
    try:
        # Visualize the data
        plt.figure(figsize=(15, 7))
        plt.plot(df_resampled['Date'], df_resampled['EUR/USD'], color='blue', linewidth=0.8)
        plt.xlabel('Date', fontsize=12)
        plt.ylabel('EUR/USD Price', fontsize=12)
        plt.title('Forex Pairing Price Chart (1-Minute Resampled Data)', fontsize=14)
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.tight_layout()
        plt.savefig('/content/forex_price_chart.png')
        plt.show()
        print("Forex price chart saved as 'forex_price_chart.png'")
    except Exception as e:
        print(f"Error charting original signal: {e}")
else:
    print("Cannot chart original signal as 'df_resampled' is empty.")


## 3. Fourier Transform Analysis: Decomposing the Signal

The Fourier Transform (FT) is a powerful mathematical tool that decomposes a time-domain signal into its constituent frequencies. This allows us to understand the various cycles and periodicities present in the data. For financial data, this can reveal underlying market rhythms that might not be obvious in the raw price series. The output of the FT provides information on the amplitude (strength) and phase (starting point) of each frequency component.

**Why it's useful:** The FT helps us identify dominant cycles that could be driving price movements. For instance, a strong amplitude at a particular frequency might indicate a recurring pattern that could be leveraged in trading strategies. The DC component (zero frequency) is proportional to the mean level of the signal (for an unnormalized FFT it equals the sum of all samples); it does not describe the trend. In the analysis that follows, the DC component is excluded when ranking frequency components, so that the focus stays on the oscillations around the mean level, where any predictable structure in otherwise unpredictable data would appear.
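As a quick illustration of how amplitude, phase, and the DC component can be read off an FFT, here is a minimal sketch on a synthetic signal. All values (`level`, `amp`, `period`, `phase`) are hypothetical, and `numpy.fft` is used instead of `scipy.fftpack` only to keep the example self-contained.

```python
import numpy as np

# Synthetic "price": a constant level plus one cosine with known amplitude and phase.
n = 240                                   # number of 1-minute samples (hypothetical)
t = np.arange(n)
level, amp, period, phase = 1.10, 0.002, 60, 0.5
x = level + amp * np.cos(2 * np.pi * t / period + phase)

X = np.fft.fft(x)
freq = np.fft.fftfreq(n, d=1)             # cycles per minute when d = 1 minute

dc_mean = X[0].real / n                   # DC bin divided by n is the sample mean
k = np.argmax(np.abs(X[1 : n // 2])) + 1  # strongest positive-frequency bin
peak_period = 1 / freq[k]                 # minutes per cycle
peak_amp = 2 * np.abs(X[k]) / n           # one-sided amplitude of the cosine
peak_phase = np.angle(X[k])               # phase in radians

print(dc_mean, peak_period, peak_amp, peak_phase)
```

The recovery is this clean only because the cosine completes a whole number of cycles within the window; on real price data, spectral leakage would spread the energy across neighbouring bins.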

In [None]:
def perform_fourier_transform(data_series):
    """Performs Fourier Transform on a given time series.

    Args:
        data_series (pd.Series): The time series data (e.g., 'EUR/USD' prices).

    Returns:
        pd.DataFrame: A DataFrame containing Frequency, Amplitude, Phase, Time, Price, and Date/Time.
        np.ndarray: The raw FFT values.
    """
    n = len(data_series)
    t = np.arange(n)  # Time array in samples
    X = fft(data_series.values)  # Fourier transform

    # Calculate the frequency array. 'd' is the sample spacing and sets the frequency unit:
    # with 1-minute samples, d=1 gives cycles per minute; d=60 (seconds) would give Hz.
    freq = fftfreq(n, d=1)

    amplitude = np.abs(X)
    phase = np.angle(X)

    ft_output = pd.DataFrame({
        'Frequency': freq,
        'Amplitude': amplitude,
        'Phase': phase,
        'Time': t,
        'Price': data_series.values,
        'Date/Time': data_series.index # Use the original datetime index
    })
    return ft_output, X

if not df_resampled.empty and 'EUR/USD' in df_resampled.columns:
    try:
        # Set 'Date' as index for the series before passing to FT function
        ft_output_df, full_fft_values = perform_fourier_transform(df_resampled.set_index('Date')['EUR/USD'])

        # Save the output to a new CSV file
        output_path = '/content/ft_output.csv'
        ft_output_df.to_csv(output_path, index=False)
        print(f"Fourier Transform output saved to '{output_path}'")
        print("\nFirst 5 rows of Fourier Transform Output DataFrame:")
        print(ft_output_df.head())
    except Exception as e:
        print(f"Error performing Fourier Transform: {e}")
else:
    print("Cannot perform Fourier Transform as 'df_resampled' is empty or 'EUR/USD' column is missing.")


## 4. Visualizing Fourier Transform Components on a Bloch Sphere

A Bloch Sphere is a geometric representation of the pure state space of a two-level quantum mechanical system. While typically used in quantum computing to visualize qubits, we adapt it here conceptually to represent the complex-valued nature of our Fourier Transform components. By mapping their properties to a 3D spherical coordinate system, we can gain a novel visual perspective on the relationships between different oscillatory patterns in the financial data. In this version, the normalized absolute frequency is mapped to the radius and the phase to the azimuthal angle (`theta` in the code), while the polar angle (`phi`) is fixed at π/2, so every vector lies in the equatorial plane. Amplitude is not encoded.

**Why it's useful:** This unconventional visualization helps to intuitively grasp how different frequency components and their phases (timings) relate to one another. In the long term, this conceptualization could pave the way for leveraging quantum computing principles to analyze complex financial data, where Bloch spheres might represent entire quantum functions or states related to market dynamics.
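The coordinate mapping can be sketched in isolation. The helper name `spherical_to_cartesian` and the three sample components are hypothetical; the point is only to show that fixing the polar angle at π/2 places every vector in the equatorial plane.

```python
import numpy as np

def spherical_to_cartesian(r, azimuth, polar):
    """Convert spherical coordinates (angles in radians) to Cartesian x, y, z."""
    x = r * np.sin(polar) * np.cos(azimuth)
    y = r * np.sin(polar) * np.sin(azimuth)
    z = r * np.cos(polar)
    return x, y, z

# Hypothetical components: frequency in cycles/minute, phase in radians
freqs = np.array([0.01, 0.02, 0.04])
phases = np.array([0.0, np.pi / 2, np.pi])

r = freqs / freqs.max()                   # radius scaled to the range 0..1
x, y, z = spherical_to_cartesian(r, azimuth=phases, polar=np.pi / 2)
print(np.round(x, 3), np.round(y, 3), np.round(z, 3))
```

Letting the polar angle vary, for example with relative amplitude, would be one way to use the third dimension, but that is a design choice beyond what is described here.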

In [None]:
if 'ft_output_df' in locals() and not ft_output_df.empty:
    try:
        # Take the first 30 rows (or fewer if less data is available) for visualization
        ft_output_30 = ft_output_df.iloc[0:min(30, len(ft_output_df))].copy()

        # For visualization on a sphere, we'll map frequency to radius and phase to one of the angles.
        # We normalize the frequency to avoid extremely large spheres and fix phi for a clearer view of phase variations.
        r_max = ft_output_30['Frequency'].abs().max() # Max absolute frequency for scaling
        if r_max == 0: # Handle case where all frequencies are zero
            print("All frequencies are zero, cannot create Bloch Sphere visualization.")
        else:
            r = ft_output_30['Frequency'].abs() / r_max  # Scale frequency to a reasonable radius (0 to 1)
            theta = ft_output_30['Phase']  # Phase in radians
            phi = np.pi / 2  # Fixed angle for visualization, effectively plotting on the XY plane for clarity

            # Convert spherical to Cartesian coordinates
            # x = r * sin(phi) * cos(theta)
            # y = r * sin(phi) * sin(theta)
            # z = r * cos(phi)
            # Since phi is fixed at pi/2, sin(phi)=1 and cos(phi)=0, so z will be 0.
            x = r * np.cos(theta)
            y = r * np.sin(theta)
            z = np.zeros_like(r) # All points will be on the XY plane due to phi = pi/2

            # Create 3D plot for the vectors on a sphere
            fig = plt.figure(figsize=(10, 8))
            ax = fig.add_subplot(111, projection='3d')

            # Plot vectors from the origin
            for i in range(len(ft_output_30)):
                ax.quiver(0, 0, 0, x[i], y[i], z[i], color='b', arrow_length_ratio=0.1)

            # Add a sphere for context (optional, but good for visualization)
            u = np.linspace(0, 2 * np.pi, 100)
            v = np.linspace(0, np.pi, 100)
            sphere_x = np.outer(np.cos(u), np.sin(v))
            sphere_y = np.outer(np.sin(u), np.sin(v))
            sphere_z = np.outer(np.ones(np.size(u)), np.cos(v))
            ax.plot_surface(sphere_x, sphere_y, sphere_z, color='c', alpha=0.1, linewidth=0)

            # Set labels
            ax.set_xlabel('X (Scaled Frequency Cosine Phase)')
            ax.set_ylabel('Y (Scaled Frequency Sine Phase)')
            ax.set_zlabel('Z (Fixed)')

            # Set the limits to be symmetric around zero for a centered sphere
            max_range = 1.0 # Since radius is scaled to 0-1
            ax.set_xlim([-max_range, max_range])
            ax.set_ylim([-max_range, max_range])
            ax.set_zlim([-max_range, max_range])

            # Set the title
            ax.set_title('Fourier Transform Components on a Conceptual Sphere (First 30 Waves)')
            plt.tight_layout()
            plt.savefig('/content/bloch_sphere_representation.png')
            plt.show()
            print("Bloch Sphere representation plot saved as 'bloch_sphere_representation.png'")

    except Exception as e:
        print(f"Error generating Bloch Sphere representation: {e}")
else:
    print("Cannot generate Bloch Sphere representation as 'ft_output_df' is empty.")


## 5. Segmenting and Analyzing the Signal for Time-Dependent Patterns

To understand how the signal's frequency content evolves over time, we divide the entire dataset into multiple segments (e.g., 20 equal parts). Performing a Fourier Transform on each segment allows us to capture time-dependent elements and observe local variations in the dominant frequencies, amplitudes, and phases. This approach helps in identifying shifts in market behavior that might not be apparent from a single, global Fourier Transform.

**Why it's useful:** Financial markets are dynamic, and patterns can emerge or disappear over time. By segmenting the data, we gain a more granular view, which is essential for developing adaptive trading strategies. We expect to see different dominant frequencies and magnitudes in different segments, reflecting the non-stationary nature of financial time series.
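To make the idea concrete, the sketch below applies a per-segment FFT to a synthetic series whose cycle length changes halfway through. The function name `dominant_period_per_segment` and the test signal are assumptions for illustration, and `numpy.fft` is used to keep the example self-contained.

```python
import numpy as np

def dominant_period_per_segment(x, num_segments):
    """Return the period (in samples) of the strongest non-DC bin in each segment."""
    periods = []
    for seg in np.array_split(np.asarray(x, dtype=float), num_segments):
        seg = seg - seg.mean()                  # remove the DC level
        spectrum = np.abs(np.fft.rfft(seg))     # one-sided spectrum for real input
        freqs = np.fft.rfftfreq(len(seg), d=1)  # cycles per sample
        k = np.argmax(spectrum[1:]) + 1         # skip the zero-frequency bin
        periods.append(1 / freqs[k])
    return periods

# Synthetic series: a 50-sample cycle for the first half, a 20-sample cycle afterwards.
t = np.arange(400)
x = np.concatenate([
    np.sin(2 * np.pi * t[:200] / 50),
    np.sin(2 * np.pi * t[200:] / 20),
])
periods = dominant_period_per_segment(x, num_segments=2)
print(periods)
```

A single FFT over all 400 samples would show both peaks but not when each cycle was active. Shorter segments localize changes in time at the cost of coarser frequency resolution.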

In [None]:
def numerical_ft_segment(data_array):
    """Performs Fourier Transform on a single segment of data.

    Args:
        data_array (np.ndarray): The segment of time series data.

    Returns:
        pd.DataFrame: A DataFrame containing Frequency, Amplitude, and Phase for the segment.
    """
    n = len(data_array)
    F_values = fft(data_array)
    freq = fftfreq(n, d=1)  # Assuming 1-minute sample spacing
    return pd.DataFrame({
        'Frequency': freq,
        'Amplitude': np.abs(F_values),
        'Phase': np.angle(F_values)
    })

def process_data_segment(args):
    """Function to process a single segment of data for parallel execution.

    Args:
        args (tuple): (start index, end index, segment number, DataFrame with an 'EUR/USD' column).

    Returns:
        pd.DataFrame: Fourier Transform results for the segment with a 'Segment' column.
    """
    start, end, segment_num, df_data = args
    segment_data = df_data['EUR/USD'].iloc[start:end]
    ft_segment = numerical_ft_segment(segment_data.values)
    ft_segment['Segment'] = segment_num
    return ft_segment

if not df_resampled.empty:
    try:
        # Perform initial FT on entire dataset
        # This output will serve as the optimal basis waves for comparison later.
        print("Performing FT on entire dataset to establish optimal basis waves...")
        ft_full_optimal, _ = perform_fourier_transform(df_resampled.set_index('Date')['EUR/USD'])

        # `ft_full_optimal` already contains 'Frequency', 'Amplitude', and 'Phase'
        # for the full dataset, alongside the original price and timestamp columns.

        # Prepare segments for parallel processing
        num_segments = 20 # Chosen to allow for more time dependent elements to be extracted
        segment_size = len(df_resampled) // num_segments
        segments_args = []
        for i in range(num_segments):
            start = i * segment_size
            end = (i + 1) * segment_size if i < num_segments - 1 else len(df_resampled)
            segments_args.append((start, end, i + 1, df_resampled)) # Pass df_resampled to avoid global access in pool

        # Use multiprocessing to perform FT on segments for efficiency
        print(f"Performing FT on {num_segments} segments using multiprocessing...")
        # Pass df_resampled directly as an argument to process_data_segment
        with Pool(processes=cpu_count()) as pool:
            segment_fts = pool.map(process_data_segment, segments_args)

        # Combine all segment FTs
        all_segment_fts = pd.concat(segment_fts, ignore_index=True)

        # The full-signal FT and the per-segment FTs have different row counts, so
        # concatenating them side by side would only produce NaN padding.
        # They are therefore kept and saved as two separate outputs:
        # 1. `ft_full_optimal`: The FT of the entire signal (optimal basis waves).
        # 2. `all_segment_fts`: The FTs of all individual segments.
        ft_full_optimal.to_csv('/content/fourier_transform_full_optimal.csv', index=False)
        all_segment_fts.to_csv('/content/fourier_transform_segments.csv', index=False)

        print("Processing complete. Fourier Transform results for full data and segments saved.")
        print("\nShape of full FT optimal DataFrame:", ft_full_optimal.shape)
        print("\nShape of all segments FT DataFrame:", all_segment_fts.shape)
        print("\nColumns in full FT optimal DataFrame:", ft_full_optimal.columns)
        print("\nColumns in all segments FT DataFrame:", all_segment_fts.columns)

    except Exception as e:
        print(f"Error during segmentation and FT analysis: {e}")
else:
    print("Cannot perform segmentation and FT analysis as 'df_resampled' is empty.")


## 6. Visualizing Segment Boundaries

To better understand the segmentation, we visually mark the boundaries of each segment on the original time series chart. This helps in correlating specific time periods with the Fourier Transform results of their respective segments.

**Why it's useful:** This visualization provides a clear reference for the time intervals corresponding to each segment's spectral analysis, allowing for a more intuitive interpretation of how market dynamics change over time.

In [None]:
if not df_resampled.empty:
    try:
        # Calculate the slice boundaries
        slice_size = len(df_resampled) // 20
        slice_boundaries = [df_resampled['Date'].iloc[i * slice_size] for i in range(1, 20)]
        slice_boundaries.append(df_resampled['Date'].iloc[-1])  # Add the last date

        # Create the plot
        plt.figure(figsize=(20, 10))

        # Plot the original price data
        plt.plot(df_resampled['Date'], df_resampled['EUR/USD'], color='blue', linewidth=0.5, label='EUR/USD')

        # Add red lines for each slice
        for boundary in slice_boundaries:
            plt.axvline(x=boundary, color='red', linestyle='--', linewidth=0.5)

        # Customize the plot
        plt.title('EUR/USD Exchange Rate Over Time with Segment Markers', fontsize=16)
        plt.xlabel('Year', fontsize=12)
        plt.ylabel('EUR/USD Exchange Rate', fontsize=12)

        # Format x-axis to show years
        years = mdates.YearLocator()
        years_fmt = mdates.DateFormatter('%Y')
        plt.gca().xaxis.set_major_locator(years)
        plt.gca().xaxis.set_major_formatter(years_fmt)

        # Rotate and align the tick labels so they look better
        plt.gcf().autofmt_xdate()

        # Add a legend
        plt.legend(['EUR/USD', 'Segment Boundaries'], loc='upper left')

        # Add grid for better readability
        plt.grid(True, linestyle=':', alpha=0.6)

        # Save the plot
        plt.savefig('/content/forex_timeseries_with_slices.png', dpi=300, bbox_inches='tight')
        plt.show()
        print("Chart has been saved as 'forex_timeseries_with_slices.png'")

        # Display some statistics about the slices
        print("\nSegment Information:")
        for i, boundary in enumerate(slice_boundaries):
            start_date = df_resampled['Date'].iloc[i * slice_size]
            # The boundary is the first timestamp of the next segment
            # (or the last timestamp of the data for the final segment).
            end_date = boundary
            print(f"Segment {i+1}: {start_date} to {end_date}")

    except Exception as e:
        print(f"Error visually charting segments: {e}")
else:
    print("Cannot chart segments as 'df_resampled' is empty.")


## 7. Comparing Segment-Specific Basis Waves to the Overall Signal

This section aims to quantify the similarity between the dominant oscillatory patterns (basis waves) found in each time segment and the overall dominant patterns of the entire EUR/USD signal. We achieve this by:

1.  **Extracting Optimal Basis Waves:** Identifying the 20 largest-amplitude frequency bins (excluding DC) from the Fourier Transform of the *entire* dataset. Because the spectrum of a real-valued signal is symmetric, these bins typically come as positive/negative frequency pairs, so they correspond to roughly 10 distinct cycles. This set represents the 'most important' cycles across the full timeframe.
2.  **Segment-wise Transform:** Performing Fourier Transform on each individual segment to get its own top 20 frequencies, amplitudes, and phases.
3.  **Variance Calculation:** Computing the variance of the element-wise differences between the full-signal components and each segment's components, matched by amplitude rank. Low variance is read as a high degree of similarity, suggesting that the segment's behavior closely aligns with the overall market rhythm.

**Why it's useful:** This comparison helps us understand how consistent the market's behavior is over time. Segments with high variance might indicate periods of anomalous or unique market conditions, while segments with low variance suggest periods where the market adheres closely to its long-term characteristics. This information is critical for identifying specific periods conducive to certain trading strategies or for detecting shifts in market regimes.
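Two properties of this kind of comparison are worth keeping in mind when reading the scores. The sketch below is illustrative only (the helper names `compare_components` and `peak_amplitude` are hypothetical). It shows that the variance of a difference ignores a constant offset, whereas the mean squared difference does not, and that raw FFT magnitudes scale with the number of samples unless they are divided by the segment length.

```python
import numpy as np

def compare_components(reference, candidate):
    """Return two distance scores between equally long component arrays."""
    diff = np.asarray(reference, dtype=float) - np.asarray(candidate, dtype=float)
    return {
        "var_of_diff": float(np.var(diff)),         # ignores a constant offset
        "mean_sq_diff": float(np.mean(diff ** 2)),  # penalizes a constant offset
    }

# A candidate that is off by a constant 10 everywhere
ref = np.array([1.0, 2.0, 3.0, 4.0])
scores = compare_components(ref, ref + 10.0)

def peak_amplitude(x):
    """Largest non-DC FFT magnitude, divided by the number of samples."""
    x = np.asarray(x, dtype=float)
    return float(np.abs(np.fft.fft(x))[1:].max() / len(x))

wave = np.sin(2 * np.pi * np.arange(2000) / 20)
full_peak = peak_amplitude(wave)            # whole signal
segment_peak = peak_amplitude(wave[:100])   # one twentieth of it
print(scores, full_peak, segment_peak)
```

With length normalization, the full signal and the segment report the same peak amplitude for the same wave; without it, the two would differ by the ratio of their lengths.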

In [None]:
def extract_top_n_ft_components(ft_data, n=20):
    """Extracts the top N most significant Fourier Transform components (excluding DC).

    Args:
        ft_data (pd.DataFrame): DataFrame from perform_fourier_transform.
        n (int): Number of top components to extract.

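    Returns:
        tuple: (frequencies, magnitudes, phases, dc_amplitude). The first three are
            arrays for the top N components; dc_amplitude is the magnitude of the
            zero-frequency bin (for a raw FFT, the absolute sum of all samples).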
    """
    # Make a copy to avoid modifying the original DataFrame or its underlying data
    temp_ft_data = ft_data.copy()

    # Store the DC component (Frequency = 0)
    dc_component_row = temp_ft_data[temp_ft_data['Frequency'] == 0]
    dc_amplitude = dc_component_row['Amplitude'].iloc[0] if not dc_component_row.empty else 0

    # Temporarily set DC component amplitude to 0 for sorting non-DC significant components
    temp_ft_data.loc[temp_ft_data['Frequency'] == 0, 'Amplitude'] = 0

    # Select the N most significant frequencies (based on amplitude)
    top_n_indices = temp_ft_data['Amplitude'].nlargest(n).index
    top_n_components = temp_ft_data.loc[top_n_indices]

    # Ensure consistency in output length by padding with zeros if not enough components
    if len(top_n_components) < n:
        missing_rows = n - len(top_n_components)
        pad_df = pd.DataFrame(0, index=range(missing_rows), columns=top_n_components.columns)
        top_n_components = pd.concat([top_n_components, pad_df], ignore_index=True)

    return (
        top_n_components['Frequency'].values[:n],
        top_n_components['Amplitude'].values[:n],
        top_n_components['Phase'].values[:n],
        dc_amplitude
    )

def calculate_transform_variance(optimal_freqs, optimal_mags, optimal_phases,
                                 segment_freqs, segment_mags, segment_phases):
    """Calculates variance between optimal and segment transform components.

    Args:
        optimal_freqs (np.ndarray): Frequencies from the full signal.
        optimal_mags (np.ndarray): Magnitudes from the full signal.
        optimal_phases (np.ndarray): Phases from the full signal.
        segment_freqs (np.ndarray): Frequencies from the segment.
        segment_mags (np.ndarray): Magnitudes from the segment.
        segment_phases (np.ndarray): Phases from the segment.

    Returns:
        dict: Dictionary of variance metrics.
    """
    # Ensure arrays have the same length for direct comparison by truncating to the smaller size
    min_len = min(len(optimal_freqs), len(segment_freqs))

    freq_var = np.var(optimal_freqs[:min_len] - segment_freqs[:min_len])
    mag_var = np.var(optimal_mags[:min_len] - segment_mags[:min_len])
    phase_var = np.var(optimal_phases[:min_len] - segment_phases[:min_len])

    avg_var = (freq_var + mag_var + phase_var) / 3

    return {
        'Frequency_Variance': freq_var,
        'Magnitude_Variance': mag_var,
        'Phase_Variance': phase_var,
        'Average_Variance': avg_var
    }

if 'ft_full_optimal' in locals() and not ft_full_optimal.empty and \
   'all_segment_fts' in locals() and not all_segment_fts.empty:
    try:
        # Extract optimal basis waves from the full signal (excluding DC component)
        optimal_frequencies, optimal_magnitudes, optimal_phases, dc_amplitude = \
            extract_top_n_ft_components(ft_full_optimal.copy(), n=20) # n=20 is a hyperparameter for analysis

        results_variance = []
        num_segments = 20 # Already defined in previous cell
        segment_size = len(df_resampled) // num_segments

        for i in range(num_segments):
            segment_num = i + 1
            start_idx = i * segment_size
            end_idx = start_idx + segment_size if i < num_segments - 1 else len(df_resampled)

            # Extract segment data for FFT calculation
            segment_data = df_resampled['EUR/USD'].iloc[start_idx:end_idx]

            # Perform FT on the segment
            ft_segment_df, _ = perform_fourier_transform(segment_data.reset_index(drop=True))

            # Extract top N components from the segment's FT
            segment_frequencies, segment_magnitudes, segment_phases, _ = \
                extract_top_n_ft_components(ft_segment_df.copy(), n=20)

            # Calculate variance metrics
            variance_metrics = calculate_transform_variance(
                optimal_frequencies, optimal_magnitudes, optimal_phases,
                segment_frequencies, segment_magnitudes, segment_phases
            )

            results_variance.append({
                'Segment': segment_num,
                'Start_Date': df_resampled['Date'].iloc[start_idx],
                'End_Date': df_resampled['Date'].iloc[end_idx - 1] if end_idx > 0 else 'N/A',
                **variance_metrics
            })

        results_variance_df = pd.DataFrame(results_variance)
        results_variance_df = results_variance_df.sort_values('Average_Variance')

        results_variance_df.to_csv('/content/segment_variance_results.csv', index=False)

        print("Analysis complete. Results saved to 'segment_variance_results.csv'")
        print("\nSummary Statistics of Segment Variance:")
        print(results_variance_df.describe())
        print("\nTop 5 best matching segments (least variance to optimal basis waves):")
        print(results_variance_df.head())
        print("\nTop 5 worst matching segments (most variance to optimal basis waves):")
        print(results_variance_df.tail())

    except Exception as e:
        print(f"Error in segment variance analysis: {e}")
else:
    print("Cannot perform segment variance analysis as necessary DataFrames are empty.")


## 8. Flattening the Reconstructed Wave and Isolating Oscillations

The reconstructed signal, built from the most significant Fourier components, captures the primary trends and strong cycles of the EUR/USD data. To specifically analyze the short-term fluctuations or 'oscillations' around this trend, we perform a 'flattening' transformation. This involves subtracting the reconstructed signal (which represents the underlying trend/major cycles) from the original signal. The result is a new series that ideally has a near-zero mean, representing the deviations from the dominant patterns.

**Why it's useful:** This process filters out the main structural movements in the data, allowing us to focus solely on the 'noise' or 'residual' patterns. For investment applications, these isolated oscillations can be informative: they might represent short-term mean-reversion opportunities or high-frequency trading signals that are obscured by larger trends. Because the reconstruction retains the DC component, the residual has a mean of approximately zero by construction. This means that deviations above and below the reconstructed 'equilibrium' cancel on average; it does not by itself show that upward and downward reversions are equally likely, which depends on the shape of the residual distribution.
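The subtraction described above can be sketched on a synthetic series. The function name `split_top_k`, the signal parameters, and the random seed are all assumptions for illustration, and `numpy.fft` stands in for `scipy.fftpack` to keep the example self-contained.

```python
import numpy as np

def split_top_k(x, k):
    """Split x into a reconstruction (DC bin plus the k largest-magnitude bins)
    and the residual left after subtracting that reconstruction."""
    x = np.asarray(x, dtype=float)
    X = np.fft.fft(x)
    mags = np.abs(X)
    mags[0] = 0.0                         # rank only the non-DC bins
    keep = np.argsort(mags)[-k:]          # indices of the k strongest bins
    X_kept = np.zeros_like(X)
    X_kept[keep] = X[keep]
    X_kept[0] = X[0]                      # keep the mean level in the reconstruction
    reconstruction = np.fft.ifft(X_kept).real
    return reconstruction, x - reconstruction

# Hypothetical series: mean level + one slow cycle + small noise
rng = np.random.default_rng(0)
t = np.arange(512)
x = 1.1 + 0.01 * np.sin(2 * np.pi * t / 128) + 0.001 * rng.standard_normal(512)

recon, resid = split_top_k(x, k=2)        # k=2 keeps one positive/negative frequency pair
print(resid.mean(), resid.std())
```

An even `k` keeps conjugate pairs together; with an odd `k` one half of a pair may be dropped, and taking the real part silently discards the resulting imaginary component.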

In [None]:
if not df_resampled.empty and 'ft_full_optimal' in locals() and not ft_full_optimal.empty:
    try:
        n = len(df_resampled)
        prices = df_resampled['EUR/USD'].values

        # Reconstruct the signal using the top 20 significant frequencies from the full FT
        # First, ensure we have the full FFT values corresponding to ft_full_optimal
        _, full_fft_raw = perform_fourier_transform(df_resampled.set_index('Date')['EUR/USD'])

        # Extract top 20 indices (excluding DC) from the full FFT
        # Store the DC component separately
        dc_component = full_fft_raw[0]
        full_fft_raw_no_dc = full_fft_raw.copy()
        full_fft_raw_no_dc[0] = 0 # Temporarily zero out DC for sorting

        significance = np.abs(full_fft_raw_no_dc)
        top_20_indices = np.argsort(significance)[-20:]
        optimal_fft_values = np.zeros(n, dtype=complex)
        optimal_fft_values[top_20_indices] = full_fft_raw[top_20_indices] # Use original values for top 20

        # Add the DC component back into the reconstruction
        optimal_fft_values[0] = dc_component

        optimal_reconstruction = np.real(ifft(optimal_fft_values))

        # Plot the original vs reconstructed signal
        plt.figure(figsize=(20, 10))
        plt.plot(df_resampled['Date'], prices, label='Original', alpha=0.7)
        plt.plot(df_resampled['Date'], optimal_reconstruction, label='Reconstructed (Top 20 Waves)', alpha=0.7)
        plt.title('Original vs Reconstructed EUR/USD Exchange Rate')
        plt.xlabel('Date')
        plt.ylabel('EUR/USD')
        plt.legend()
        plt.savefig('/content/original_vs_reconstructed.png')
        plt.show()
        print("Original vs Reconstructed plot saved as 'original_vs_reconstructed.png'")

        # Calculate the transformation to flatten the reconstructed wave.
        # The reconstruction (DC level plus top-20 cycles) is itself the component to remove.
        flattening_transform = optimal_reconstruction

        # Apply the transformation to both the reconstructed and original waves.
        # flattened_reconstruction is identically zero; flattened_original is the residual,
        # whose mean is zero because the DC component is part of the reconstruction.
        flattened_reconstruction = optimal_reconstruction - flattening_transform
        flattened_original = prices - flattening_transform

        # Plot the flattened waves
        plt.figure(figsize=(20, 10))
        plt.plot(df_resampled['Date'], flattened_original, label='Transformed Original (Oscillations)', alpha=0.7)
        plt.plot(df_resampled['Date'], flattened_reconstruction, label='Flattened Reconstruction (Near Zero)', alpha=0.7, linestyle='--')
        plt.axhline(y=0, color='r', linestyle='--', label='Zero line')
        plt.title('Flattened EUR/USD Exchange Rate (Oscillations Isolated)')
        plt.xlabel('Date')
        plt.ylabel('Transformed EUR/USD')
        plt.legend()
        plt.savefig('/content/flattened_waves.png')
        plt.show()
        print("Flattened waves plot saved as 'flattened_waves.png'")

        # Calculate statistics about the flattened waves
        original_range = np.ptp(flattened_original)
        original_std = np.std(flattened_original)
        reconstruction_range = np.ptp(flattened_reconstruction)
        reconstruction_std = np.std(flattened_reconstruction)

        print("\nStatistics of the flattened waves:")
        print(f"Original (Oscillations) - Range: {original_range:.6f}, Standard Deviation: {original_std:.6f}")
        print(f"Reconstruction (Flattened) - Range: {reconstruction_range:.6f}, Standard Deviation: {reconstruction_std:.6f}")

        # Create a DataFrame with the results for later analysis
        eur_usd_analysis_results_df = pd.DataFrame({
            'Date': df_resampled['Date'],
            'Original_Price': prices,
            'Reconstructed_Price': optimal_reconstruction,
            'Flattened_Original': flattened_original,
            'Flattened_Reconstruction': flattened_reconstruction
        })

        eur_usd_analysis_results_df.to_csv('/content/eur_usd_analysis_results.csv', index=False)
        print("\nNumerical results saved to 'eur_usd_analysis_results.csv'")

        # Save statistics to a separate CSV
        stats_df = pd.DataFrame({
            'Metric': ['Range', 'Standard Deviation'],
            'Flattened_Original': [original_range, original_std],
            'Flattened_Reconstruction': [reconstruction_range, reconstruction_std]
        })
        stats_df.to_csv('/content/eur_usd_analysis_stats.csv', index=False)
        print("Statistics saved to 'eur_usd_analysis_stats.csv'")

    except Exception as e:
        print(f"Error during signal flattening: {e}")
else:
    print("Cannot flatten signal as 'df_resampled' or 'ft_full_optimal' is empty.")


## 9. Analyzing Oscillations: Autocorrelation and Cyclical Behavior

With the primary trends removed, we can now analyze the isolated oscillations for any underlying cyclical patterns or persistence. Autocorrelation is a key statistical tool that measures how a time series is correlated with a lagged version of itself. A high autocorrelation at a specific lag indicates a repeating pattern or cycle. We will also perform spectral analysis to identify dominant frequencies within these oscillations.
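
As a minimal illustration of how autocorrelation exposes a cycle, the sketch below computes a sample autocorrelation function with plain NumPy on synthetic data. The 27-sample period, the noise level, and the helper `sample_acf` are assumptions made for this example only; the notebook's own cells rely on `statsmodels` instead.

```python
import numpy as np

def sample_acf(x, nlags):
    """Biased sample autocorrelation for lags 0..nlags (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom for k in range(nlags + 1)])

# Synthetic data (assumption): a 27-sample cycle plus noise, and pure noise for contrast
rng = np.random.default_rng(0)
n, period = 540, 27
t = np.arange(n)
cyclic = np.sin(2 * np.pi * t / period) + 0.3 * rng.standard_normal(n)
noise_only = rng.standard_normal(n)

acf_cyclic = sample_acf(cyclic, nlags=40)
acf_noise = sample_acf(noise_only, nlags=40)

# Approximate 95% band for the autocorrelation of white noise
band = 1.96 / np.sqrt(n)

# Strongest lag beyond the short-lag region, where neighbouring samples are trivially similar
best_lag = int(np.argmax(acf_cyclic[10:]) + 10)

print(f"White-noise band: +/-{band:.3f}")
print(f"Strongest lag in 10..40 for the cyclic series: {best_lag}")
print(f"ACF of the cyclic series at lag {period}: {acf_cyclic[period]:.3f}")
print(f"Largest |ACF| of pure noise (lags 1..40): {np.abs(acf_noise[1:]).max():.3f}")
```

A peak well outside the white-noise band near the true period is what a genuine cycle would look like; pure noise stays mostly inside the band.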

**Why it's useful:** Understanding the autocorrelation and spectral properties of these oscillations helps determine their predictability. If strong, statistically significant cycles are present, they could be exploited for short-term trading strategies. The absence of clear peaks, however, might point to a more complex, 'long-memory' process that calls for dedicated time series models (such as ARIMA or GARCH) or more advanced machine learning techniques for accurate forecasting.
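
The spectral side of the same idea can be sketched with a raw periodogram built from NumPy's FFT. This is an illustrative example on synthetic data (a 27-day cycle sampled once per day is assumed), not the notebook's `scipy.signal.periodogram` call, and the scaling used here is only correct up to a constant factor.

```python
import numpy as np

# Synthetic data (assumption): 540 daily samples containing a 27-day cycle plus noise
rng = np.random.default_rng(1)
n, period = 540, 27
t = np.arange(n)
x = np.sin(2 * np.pi * t / period) + 0.5 * rng.standard_normal(n)

# Remove the mean so the zero-frequency bin does not dominate
x = x - x.mean()

# Raw periodogram: squared FFT magnitude at each non-negative frequency
spectrum = np.fft.rfft(x)
power = (np.abs(spectrum) ** 2) / n
freqs = np.fft.rfftfreq(n, d=1.0)  # cycles per day, one sample per day

# Dominant frequency, skipping the zero-frequency bin
k = int(np.argmax(power[1:]) + 1)
dominant_period = 1.0 / freqs[k]

print(f"Dominant frequency: {freqs[k]:.5f} cycles/day")
print(f"Corresponding period: {dominant_period:.2f} days")
```

A single sharp peak indicates a fixed-period cycle; power spread across many low-frequency bins is more typical of trending or long-memory behavior.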

In [None]:
if 'eur_usd_analysis_results_df' in locals() and not eur_usd_analysis_results_df.empty:
    try:
        # The oscillations (original price minus the zero-centered reconstruction) were
        # computed in the previous step and stored as 'Flattened_Original'.
        # For clarity, we refer to them as 'oscillations' in this context.
        oscillations = eur_usd_analysis_results_df['Flattened_Original']

        # 1. Basic Statistical Analysis of Oscillations
        print("\nBasic Statistical Analysis of Oscillations (Flattened Original):")
        print(oscillations.describe())

        # 2. Time Series Plot of Oscillations
        plt.figure(figsize=(20, 10))
        plt.plot(eur_usd_analysis_results_df['Date'], oscillations, label='Oscillations')
        plt.title('Oscillations of Original Signal around Flattened Reconstruction')
        plt.xlabel('Date')
        plt.ylabel('Oscillation Magnitude')
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.legend()
        plt.tight_layout()
        plt.savefig('/content/oscillations_time_series.png')
        plt.show()
        print("Oscillations time series plot saved as 'oscillations_time_series.png'")

        # 3. Histogram of Oscillations
        plt.figure(figsize=(12, 6))
        plt.hist(oscillations, bins=50, edgecolor='black', alpha=0.7)
        plt.title('Distribution of Oscillations')
        plt.xlabel('Oscillation Magnitude')
        plt.ylabel('Frequency')
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.tight_layout()
        plt.savefig('/content/oscillations_histogram.png')
        plt.show()
        print("Oscillations histogram plot saved as 'oscillations_histogram.png'")

        # 4. Autocorrelation Analysis
        lags_to_plot = 100 # Adjust as needed
        autocorr_values = acf(oscillations, nlags=lags_to_plot) # Compute autocorrelation function

        plt.figure(figsize=(15, 7))
        plot_acf(oscillations, lags=lags_to_plot, ax=plt.gca(), title='Autocorrelation of EUR/USD Oscillations')

        # Add vertical reference lines at multiples of 27 days.
        # A 27-day cycle was an earlier hypothesis that the autocorrelation analysis did not support;
        # the lines only mark where such a cycle would appear.
        for i in range(1, int(lags_to_plot / 27) + 1):
            plt.axvline(x=i * 27, color='g', linestyle='--', alpha=0.5, label=f'{i*27} Days (Previous Hypothesis)')

        # Only add legend if lines are actually plotted
        if int(lags_to_plot / 27) >= 1:
            handles, labels = plt.gca().get_legend_handles_labels()
            unique_labels = dict(zip(labels, handles)) # Deduplicate labels
            plt.legend(unique_labels.values(), unique_labels.keys())

        plt.xlabel('Lag (days)')
        plt.ylabel('Autocorrelation Coefficient')
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.tight_layout()
        plt.savefig('/content/oscillations_autocorrelation.png')
        plt.show()
        print("Autocorrelation plot saved as 'oscillations_autocorrelation.png'")

        print("\nAutocorrelation values around 27 days (Previous Hypothesis Reference):")
        for i in range(25, 30): # Checking around the 27-day mark
            if i < len(autocorr_values):
                print(f"Lag {i}: {autocorr_values[i]:.4f}")

        # 5. Power Spectral Density (PSD) Analysis
        # PSD helps identify the frequencies at which the signal power is concentrated.
        f, Pxx = signal.periodogram(oscillations)
        plt.figure(figsize=(12, 6))
        plt.semilogy(f, Pxx)
        plt.title('Power Spectral Density of Oscillations')
        plt.xlabel('Frequency (Cycles/Day)')
        plt.ylabel('Power Spectral Density')
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.tight_layout()
        plt.savefig('/content/oscillations_psd.png')
        plt.show()
        print("Power Spectral Density plot saved as 'oscillations_psd.png'")

        # 6. Rolling Statistics of Oscillations
        # Rolling mean and standard deviation help identify changes in the central tendency and volatility of oscillations over time.
        window = 30  # 30-day window is a common choice for monthly trends.
        eur_usd_analysis_results_df['Rolling_Mean'] = oscillations.rolling(window=window).mean()
        eur_usd_analysis_results_df['Rolling_Std'] = oscillations.rolling(window=window).std()

        plt.figure(figsize=(20, 10))
        plt.plot(eur_usd_analysis_results_df['Date'], oscillations, label='Oscillations', alpha=0.7)
        plt.plot(eur_usd_analysis_results_df['Date'], eur_usd_analysis_results_df['Rolling_Mean'], label=f'{window}-Day Rolling Mean of Oscillations', color='orange')
        plt.plot(eur_usd_analysis_results_df['Date'], eur_usd_analysis_results_df['Rolling_Std'], label=f'{window}-Day Rolling Std Dev of Oscillations', color='green')
        plt.title(f'Rolling Statistics (Mean and Standard Deviation) of EUR/USD Oscillations (Window={window} Days)')
        plt.xlabel('Date')
        plt.ylabel('Magnitude')
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.legend()
        plt.tight_layout()
        plt.savefig('/content/oscillations_rolling_stats.png')
        plt.show()
        print("Rolling statistics plot saved as 'oscillations_rolling_stats.png'")

        # 7. Identifying Potential Cycles (Peaks)
        # We identify peaks in the oscillation signal to calculate average distances between them.
        # A minimum distance of 20 days is set to avoid capturing very high-frequency noise as distinct peaks.
        peaks, _ = signal.find_peaks(oscillations, distance=20)

        plt.figure(figsize=(20, 10))
        plt.plot(eur_usd_analysis_results_df['Date'], oscillations, label='Oscillations')
        plt.plot(eur_usd_analysis_results_df['Date'].iloc[peaks], oscillations.iloc[peaks], "x", color='red', markersize=8, label='Identified Peaks')
        plt.title('Identified Peaks in EUR/USD Oscillations')
        plt.xlabel('Date')
        plt.ylabel('Oscillation Magnitude')
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.legend()
        plt.tight_layout()
        plt.savefig('/content/oscillations_peaks.png')
        plt.show()
        print("Identified peaks plot saved as 'oscillations_peaks.png'")

        # Calculate average distance between peaks
        peak_distances = np.diff(peaks)
        if len(peak_distances) > 0:
            avg_peak_distance = np.mean(peak_distances)
            print(f"\nAverage distance between peaks: {avg_peak_distance:.2f} days")
        else:
            print("\nNo sufficient peaks found to calculate average distance.")

        # Save the full oscillation analysis results
        eur_usd_analysis_results_df.to_csv('/content/eur_usd_oscillation_analysis.csv', index=False)
        print("\nOscillation analysis results saved to 'eur_usd_oscillation_analysis.csv'")
        print("All analysis plots have been saved as PNG files in '/content'.")

    except Exception as e:
        print(f"Error during oscillation analysis: {e}")
else:
    print("Cannot perform oscillation analysis as 'eur_usd_analysis_results_df' is empty.")


## 10. Confirming Mean Distribution of Flattened Signal

After flattening the signal by subtracting the zero-centered reconstructed trend, it is essential to verify that the resulting oscillations are properly centered. Because the flattening step leaves `Flattened_Reconstruction` at a constant level (the mean of the reconstructed price), the mean of `Flattened_Original` should sit at that level, which is equivalent to a near-zero mean for the deviations from the trend. This confirms that the trend has been effectively removed and is a critical validation step for the integrity of the isolated oscillatory component.

**Why it's useful:** A near-zero mean deviation indicates that positive and negative departures from the trend balance out on average. This is a characteristic feature of the 'noise' or 'residual' component of a time series: the original signal sits, on average, as far above its reconstructed 'equilibrium' as below it. This balance is a useful starting point when screening for mean-reversion trading opportunities, where the price tends to return to its average over time, although a zero mean alone does not establish mean reversion.
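
The algebra behind this check can be illustrated on synthetic data. In the sketch below, `trend` stands in for the Fourier reconstruction and `prices` for the resampled series; both are hypothetical values chosen for the example, while the three flattening lines mirror the definitions used in the flattening cell.

```python
import numpy as np

# Synthetic stand-ins (assumptions): a slow trend and prices scattered around it
rng = np.random.default_rng(2)
n = 500
t = np.arange(n)
trend = 1.10 + 0.02 * np.sin(2 * np.pi * t / 250)   # plays the role of optimal_reconstruction
prices = trend + 0.002 * rng.standard_normal(n)

# Same definitions as in the flattening step
flattening_transform = trend - trend.mean()
flattened_reconstruction = trend - flattening_transform
flattened_original = prices - flattening_transform

# Deviations of the price from the trend itself
residual = prices - trend

print(f"Flattened reconstruction is constant: {np.allclose(flattened_reconstruction, trend.mean())}")
print(f"Mean of flattened original: {flattened_original.mean():.6f}")
print(f"Mean of reconstruction:     {trend.mean():.6f}")
print(f"Mean of residual:           {residual.mean():.6f}")
```

Under these assumptions the flattened reconstruction is a flat line at the reconstruction's mean, and subtracting that constant from the flattened original recovers the zero-centered residual.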

In [None]:
if 'eur_usd_analysis_results_df' in locals() and not eur_usd_analysis_results_df.empty:
    try:
        # Recap of the definitions from the flattening cell:
        #   flattening_transform     = optimal_reconstruction - np.mean(optimal_reconstruction)
        #   flattened_reconstruction = optimal_reconstruction - flattening_transform
        #                            = np.mean(optimal_reconstruction)   (a constant)
        #   flattened_original       = prices - flattening_transform
        # 'Flattened_Reconstruction' is therefore a constant (the mean of the reconstruction),
        # while 'Flattened_Original' carries the oscillations of the price around that level.

        # So the column to check is 'Flattened_Original'.
        oscillations_to_verify = eur_usd_analysis_results_df['Flattened_Original']

        total_sum_oscillations = np.sum(oscillations_to_verify)
        mean_oscillations = np.mean(oscillations_to_verify)
        std_oscillations = np.std(oscillations_to_verify)

        print(f"\nVerifying the mean of the isolated oscillations ('Flattened_Original'):")
        print(f"Sum of oscillations: {total_sum_oscillations}")
        print(f"Number of data points: {len(oscillations_to_verify)}")
        print(f"Mean of the oscillations: {mean_oscillations}")
        print(f"Standard deviation of the oscillations: {std_oscillations}")

        # Also confirm the mean of the 'Flattened_Reconstruction' is indeed the mean of the original reconstruction
        constant_flattened_reconstruction = eur_usd_analysis_results_df['Flattened_Reconstruction'].iloc[0]
        original_reconstruction_mean = np.mean(eur_usd_analysis_results_df['Reconstructed_Price'])
        print(f"\nValue of 'Flattened_Reconstruction' (should be constant): {constant_flattened_reconstruction:.6f}")
        print(f"Mean of original 'Reconstructed_Price': {original_reconstruction_mean:.6f}")
        print("This confirms that 'Flattened_Reconstruction' holds the mean of 'Reconstructed_Price', so 'Flattened_Original' oscillates around that constant level.")

    except Exception as e:
        print(f"Error during mean distribution confirmation: {e}")
else:
    print("Cannot confirm mean distribution as 'eur_usd_analysis_results_df' is empty.")


## 11. Detecting Aggressive Price Movements and Segment Analysis

Identifying 'aggressive' price movements (periods of unusually high volatility or large returns) is crucial for risk management and opportunistic trading. We detect these movements by comparing daily returns to a rolling standard deviation, flagging instances where returns exceed a certain threshold (e.g., 2 standard deviations). Subsequently, we analyze the characteristics of the Fourier and Wavelet Transforms around these aggressive movements.
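
The detection rule can be sketched in a few lines. The example below uses NumPy's `sliding_window_view` in place of the pandas rolling window, with synthetic returns and one injected shock; the helper name `flag_aggressive`, the return scale, and the shock size are all assumptions for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def flag_aggressive(returns, window=20, threshold=2.0):
    """Flag returns whose magnitude exceeds threshold x trailing rolling std (hypothetical helper)."""
    returns = np.asarray(returns, dtype=float)
    rolling_std = np.full(returns.shape, np.nan)
    # Sample std (ddof=1) of each trailing window, aligned with the window's last element
    rolling_std[window - 1:] = sliding_window_view(returns, window).std(axis=1, ddof=1)
    # Comparisons against NaN are False, so the warm-up period is never flagged
    return np.abs(returns) > threshold * rolling_std

# Synthetic returns (assumption) with a single large shock at position 150
rng = np.random.default_rng(3)
returns = 0.001 * rng.standard_normal(300)
returns[150] = 0.01

flags = flag_aggressive(returns, window=20, threshold=2.0)

print(f"Shock at position 150 flagged: {bool(flags[150])}")
print(f"Flags during warm-up (first 19 points): {int(flags[:19].sum())}")
print(f"Total flagged points: {int(flags.sum())}")
```

Note that the trailing window includes the current return, so a large move inflates its own threshold; computing the standard deviation on the preceding window only would be a stricter variant.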

**Why it's useful:** Understanding the underlying frequency and temporal characteristics of aggressive movements can help predict their onset or inform trading strategies during volatile periods. Comparing Wavelet and Fourier Transforms in these specific contexts allows us to investigate which transform provides a more stable or informative representation during high-impact events. The analysis evaluates both transforms in this specific setting, with the aim of determining whether the dataset contains enough signal to justify applying advanced ML techniques.
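
To see why the two transforms can behave differently around a sudden move, the sketch below applies a hand-written single-level Haar transform and an FFT to an isolated spike. The Haar wavelet is used here only because it is short enough to write by hand (the notebook itself uses `db4` via `pywt`, whose sign conventions may differ), and the signal is synthetic.

```python
import numpy as np

def haar_dwt_level1(x):
    """Single-level Haar transform of an even-length signal (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    pairs = x.reshape(-1, 2)
    approx = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)   # local averages
    detail = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)   # local differences
    return approx, detail

# Synthetic signal (assumption): flat except for one spike at position 40
n = 64
x = np.zeros(n)
x[40] = 1.0

approx, detail = haar_dwt_level1(x)
fft_mag = np.abs(np.fft.fft(x))

print(f"Non-zero Haar detail coefficients: {np.count_nonzero(detail)} of {detail.size}")
print(f"Position of the largest detail coefficient: {int(np.argmax(np.abs(detail)))}")
print(f"FFT magnitudes all equal: {np.allclose(fft_mag, fft_mag[0])}")
```

The spike lands in a single wavelet coefficient, which pins down when it happened, while its Fourier magnitude is spread evenly over all frequencies and carries no timing information.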

In [None]:
def detect_aggressive_movements(returns, window=20, threshold=2):
    """Detects aggressive price movements based on rolling standard deviation.

    Args:
        returns (pd.Series): Time series of financial returns.
        window (int): Rolling window size for standard deviation calculation.
        threshold (int): Multiplier for standard deviation to set the aggressive movement threshold.

    Returns:
        pd.Series: Boolean Series indicating aggressive movements.
    """
    rolling_std = returns.rolling(window=window).std()
    # Aggressive moves are where absolute returns exceed the threshold times rolling std
    aggressive_moves = np.abs(returns) > (threshold * rolling_std)
    return aggressive_moves

def wavelet_transform_segment(data_array, wavelet='db4', level=None):
    """Performs Discrete Wavelet Transform on a data segment.

    Args:
        data_array (np.ndarray): The segment of time series data.
        wavelet (str): Name of the wavelet to use (e.g., 'db4').
        level (int, optional): Decomposition level. Defaults to min(5, max_level).

    Returns:
        list: List of wavelet coefficients.
    """
    filter_len = pywt.Wavelet(wavelet).dec_len
    if len(data_array) < filter_len:
        # Zero-pad segments that are shorter than the wavelet's decomposition filter
        data_array = np.pad(data_array, (0, filter_len - len(data_array)), 'constant')

    # Automatically determine the maximum decomposition level
    max_level = pywt.dwt_max_level(len(data_array), wavelet)
    actual_level = min(5, max_level) if level is None else min(level, max_level)

    # Ensure actual_level is at least 1 if possible
    if actual_level == 0 and max_level > 0:
        actual_level = 1

    coeffs = pywt.wavedec(data_array, wavelet, level=actual_level)
    return coeffs

def fourier_transform_segment(data_array):
    """Performs Fourier Transform on a data segment.

    Args:
        data_array (np.ndarray): The segment of time series data.

    Returns:
        np.ndarray: Raw FFT values.
    """
    return fft(data_array)

def compare_normalized_transforms(wavelet_coeffs, fourier_coeffs):
    """Compares normalized wavelet and Fourier transforms using correlation.

    Args:
        wavelet_coeffs (list): List of wavelet coefficients.
        fourier_coeffs (np.ndarray): Raw Fourier Transform values.

    Returns:
        float: Correlation coefficient between the normalized transforms.
    """
    # Concatenate the (signed) wavelet coefficients into one vector; the Fourier
    # coefficients are reduced to magnitudes further below.
    # np.concatenate raises ValueError on an empty list, so filter out empty arrays first.
    non_empty_coeffs = [np.ravel(c) for c in wavelet_coeffs if np.size(c) > 0]
    if not non_empty_coeffs:
        normalized_wavelet = np.array([0.0]) # Use a placeholder to avoid error
    else:
        concatenated_wavelet = np.concatenate(non_empty_coeffs)
        # Reshape to 2D for StandardScaler
        normalized_wavelet = StandardScaler().fit_transform(concatenated_wavelet.reshape(-1, 1)).flatten()

    # Ensure fourier_coeffs is 1D and has at least one element
    fourier_magnitudes = np.abs(fourier_coeffs).flatten()
    if fourier_magnitudes.size == 0:
        normalized_fourier = np.array([0.0])
    else:
        # Reshape to 2D for StandardScaler
        normalized_fourier = StandardScaler().fit_transform(fourier_magnitudes.reshape(-1, 1)).flatten()

    # Ensure both arrays have the same length for correlation calculation
    min_len = min(len(normalized_wavelet), len(normalized_fourier))
    if min_len == 0: # If min_len is 0, correlation is undefined, return NaN
        return np.nan
    elif min_len == 1: # If min_len is 1, correlation is 1 if values are identical, else 0
        return 1.0 if np.isclose(normalized_wavelet[0], normalized_fourier[0]) else 0.0
    else:
        correlation = np.corrcoef(normalized_wavelet[:min_len], normalized_fourier[:min_len])[0, 1]
        return correlation

if not df_resampled.empty and 'Returns' in df_resampled.columns:
    try:
        # Detect aggressive movements (adjust window and threshold as needed)
        # A threshold of 2 standard deviations is a common statistical heuristic for outliers.
        df_resampled['Aggressive_Move'] = detect_aggressive_movements(df_resampled['Returns'], window=20, threshold=2)

        # Analyze each aggressive movement
        aggressive_movement_results = []
        pad = 50  # Padding before and after aggressive movement for context in the segment

        num_data_points = len(df_resampled)
        num_segments_analysis = 20 # Keep consistent with prior segmentation
        segment_len_for_index = num_data_points // num_segments_analysis

        # Iterate through the DataFrame to find aggressive moves
        for i in range(num_data_points):
            if df_resampled['Aggressive_Move'].iloc[i]:
                start_segment = max(0, i - pad)
                end_segment = min(num_data_points, i + pad + 1)

                # Extract data around the aggressive movement
                segment_data = df_resampled['EUR/USD'].iloc[start_segment:end_segment].values

                # Ensure segment_data is not empty and has enough points for transforms
                if len(segment_data) > 1: # FFT/Wavelet need at least 2 points
                    try:
                        # Perform wavelet transform
                        wavelet_coeffs = wavelet_transform_segment(segment_data)

                        # Perform Fourier transform
                        fourier_coeffs = fourier_transform_segment(segment_data)

                        # Compare transforms
                        correlation = compare_normalized_transforms(wavelet_coeffs, fourier_coeffs)

                        # Find which of the 20 segments this aggressive movement belongs to
                        # Handle edge case for the last segment index
                        segment_index = min(num_segments_analysis, (i // segment_len_for_index) + 1)

                        aggressive_movement_results.append({
                            'Date': df_resampled['Date'].iloc[i],
                            'Segment': segment_index,
                            'Correlation': correlation
                        })
                    except Exception as transform_e:
                        print(f"Warning: Could not process segment for date {df_resampled['Date'].iloc[i]} due to transform error: {transform_e}")
                else:
                    print(f"Warning: Segment too short for analysis at date {df_resampled['Date'].iloc[i]}")

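        # Create DataFrame with results (explicit columns keep the frame usable when no moves were detected)
        aggressive_movement_analysis_df = pd.DataFrame(aggressive_movement_results, columns=['Date', 'Segment', 'Correlation'])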

        # Analyze patterns in each of the 20 broader segments
        segment_correlation_analysis = []
        for segment_num in range(1, num_segments_analysis + 1):
            segment_results = aggressive_movement_analysis_df[aggressive_movement_analysis_df['Segment'] == segment_num]
            avg_correlation = segment_results['Correlation'].mean() if not segment_results.empty else np.nan
            num_aggressive_movements = len(segment_results)

            segment_correlation_analysis.append({
                'Segment': segment_num,
                'Avg_Correlation': avg_correlation,
                'Num_Aggressive_Movements': num_aggressive_movements
            })

        segment_correlation_analysis_df = pd.DataFrame(segment_correlation_analysis)

        # Save results
        aggressive_movement_analysis_df.to_csv('/content/aggressive_movement_analysis.csv', index=False)
        segment_correlation_analysis_df.to_csv('/content/segment_pattern_analysis.csv', index=False)

        # Plot results (Correlation vs Date for aggressive movements)
        plt.figure(figsize=(15, 10))
        plt.scatter(aggressive_movement_analysis_df['Date'], aggressive_movement_analysis_df['Correlation'], alpha=0.5)
        plt.title('Correlation between Wavelet and Fourier Transforms for Aggressive Movements')
        plt.xlabel('Date')
        plt.ylabel('Correlation')
        plt.grid(True, linestyle=':', alpha=0.6)
        plt.tight_layout()
        plt.savefig('/content/correlation_plot.png')
        plt.show()
        print("Analysis complete. Results saved to CSV files and correlation plot generated.")

        # Print summary
        print("\nSummary of Aggressive Movements by Segment (Sorted by Number of Movements):")
        print(segment_correlation_analysis_df.sort_values('Num_Aggressive_Movements', ascending=False))

        print("\nSegments with lowest average correlation (potential unique patterns during aggressive moves):")
        print(segment_correlation_analysis_df.sort_values('Avg_Correlation').head())

    except Exception as e:
        print(f"Error in aggressive movement detection and analysis: {e}")
else:
    print("Cannot detect aggressive movements as 'df_resampled' or 'Returns' column is empty.")


## Key Takeaways for our Client

This research initiative has provided critical insights into the oscillatory behavior of the EUR/USD exchange rate, forming a robust foundation for applying advanced AI/ML techniques in algorithmic trading strategy development.

* **Data Integrity and Preparation:** We successfully processed and resampled one year of minute-level EUR/USD data, ensuring a clean and uniform time series for Fourier Transform analysis. The initial charting revealed macroscopic trends and volatility, setting the stage for deeper spectral decomposition.

* **Fourier Transform Reveals Underlying Rhythms:** The Fourier Transform effectively decomposed the price signal into its constituent frequencies, providing valuable information on amplitude and phase. This transformation is fundamental for identifying recurring patterns that could drive price movements. The conceptual visualization on a Bloch Sphere provides a novel way to interpret these complex components, potentially opening doors for quantum computing applications in market analysis.

* **Isolation of Oscillations for Targeted Analysis:** By flattening the reconstructed signal (removing the primary trends captured by significant Fourier components), we successfully isolated the short-term oscillations. Statistical validation checked that these oscillations are centered on the reconstructed equilibrium level, i.e. that deviations from the trend have a near-zero mean. This suggests opportunities for mean-reversion strategies, since deviations above and below the trend balance out on average.

* **Complex Cyclical Structure, Not Simple Rhythms:** While initial peak analysis suggested a potential 27-day cycle in oscillations (consistent with monthly economic drivers), further autocorrelation analysis revealed a complex, long-memory structure rather than a simple, dominant fixed-period cycle. This indicates that while past movements do influence future values for an extended period, a straightforward 27-day pattern is not a primary, statistically significant feature. This outcome is crucial as it informs us that simple, fixed-period trading strategies are unlikely to be effective and that more sophisticated, adaptive models will be required.

* **Foundation for Advanced ML Techniques:** The analysis suggests that there is sufficient signal in the dataset to apply advanced machine learning techniques, some of which leverage Fourier and Wavelet Transforms. The intricate patterns observed in the oscillations, particularly their long-lasting correlations and the nuanced insights from comparing Fourier and Wavelet transforms during aggressive price movements, highlight the need for models capable of discerning complex, non-linear relationships. This work sets the stage for developing predictive models and sophisticated algorithmic trading strategies for your investment firm.

apoth3osis R&D remains committed to pioneering solutions at the intersection of AI/ML and financial markets, transforming complex data into actionable insights for your investment success.