# PSO Implementation and Results Comparison between My PSOs and PySwarms

* This notebook compares the custom PSO implementations ("My PSOs": AsyncPSO and OpenMP_PSO) with PySwarms on three benchmark functions: Rosenbrock, Ackley, and Rastrigin.
* For each function, the best parameter configurations are listed, and their final costs and execution times are compared.
* Convergence plots are generated for the best configuration of each method.
* Box plots visualize the distribution, and hence the variability, of final costs and execution times.
* Scatter plots comparing final cost and execution time between the methods are planned (see the placeholder section).
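
For reference, the three benchmarks can be written in their common textbook form as below. The notebook does not show the implementations used to produce `pso_results.csv`, so the constants here (a=20, b=0.2, c=2π for Ackley; A=10 for Rastrigin) are assumptions, and this is only a sketch.

```python
import math

def rosenbrock(x):
    """Textbook Rosenbrock function; global minimum 0 at x = (1, ..., 1)."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))

def ackley(x):
    """Textbook Ackley function (assumed a=20, b=0.2, c=2*pi); minimum 0 at the origin."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    mean_cos = sum(math.cos(2.0 * math.pi * v) for v in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(mean_sq)) - math.exp(mean_cos) + 20.0 + math.e

def rastrigin(x):
    """Textbook Rastrigin function (assumed A=10); minimum 0 at the origin."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v) for v in x)
```

All three have a global minimum of 0, which is why final costs close to zero indicate good convergence and why the cost axes later in the notebook use a logarithmic scale.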


In [1]:
# Importing libraries
import ast
import re

import matplotlib.pyplot as plt
import pandas as pd
from IPython.display import display

%matplotlib inline

# Load the CSV results file
csv_file = 'pso_results.csv'
expected_columns = ['function', 'method', 'n_particles', 'iters', 'w', 'c1', 'c2', 'dim', 'cost', 'execution_time', 'cost_history']
try:
    df = pd.read_csv(csv_file)
    print(f"Successfully loaded {csv_file}")
except FileNotFoundError:
    print(f"Error: The file {csv_file} was not found in the current directory.")
    print("Please make sure the CSV file is present or update the path.")
    print("Displaying analysis with an empty DataFrame as a placeholder.")
    df = pd.DataFrame(columns=expected_columns)

functions = ['ackley', 'rastrigin', 'rosenbrock']
methods = df['method'].unique() if not df.empty else []
for func in functions:
    if not df.empty:
        func_df = df[df['function'] == func]
        if not func_df.empty:
            best_results = func_df.sort_values(by='cost').head(5)
            print(f"\nTop 5 configurations for {func}:")
            display(best_results[['method', 'n_particles', 'iters', 'w', 'c1', 'c2', 'dim', 'cost', 'execution_time']])
        else:
            print(f"\nNo data found for function {func}.")
    else:
        print(f"\nDataFrame is empty. Cannot show top configurations for {func}.")

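def extract_floats(x):
    """Parse a stringified cost history such as '[np.float64(1.5), ...]' into a list of floats."""
    if isinstance(x, str):
        try:
            # Strip the "np.float64(...)" wrappers, then parse the remaining list literal
            x_cleaned = re.sub(r'np\.float64\((.*?)\)', r'\1', x)
            return [float(val) for val in ast.literal_eval(x_cleaned)]
        except (ValueError, SyntaxError, TypeError) as e:
            # Malformed literal, or a value that is not a list of numbers
            print(f"Error converting cost_history: {e}, value: {x}")
            return []
    # Already-parsed lists pass through; anything else (e.g. NaN) becomes an empty history
    return x if isinstance(x, list) else []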

if 'cost_history' in df.columns:
    df['cost_history'] = df['cost_history'].apply(extract_floats)
else:
    print("\n'cost_history' column not found in DataFrame. Skipping conversion.")


Error: The file pso_results.csv was not found in the current directory.
Please make sure the CSV file is present or update the path.
Displaying analysis with an empty DataFrame as a placeholder.

DataFrame is empty. Cannot show top configurations for ackley.

DataFrame is empty. Cannot show top configurations for rastrigin.

DataFrame is empty. Cannot show top configurations for rosenbrock.
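
The `cost_history` conversion above can be tried without the CSV file. The sketch below uses a hypothetical helper, `parse_cost_history`, that mirrors the same regex-plus-`ast.literal_eval` approach, and a made-up input string. The `np.float64(...)` wrappers are presumably there because the history was saved as the string form of a list of NumPy scalars; that is an assumption about how the file was written.

```python
import ast
import re

def parse_cost_history(raw):
    """Hypothetical stand-in for the notebook's conversion step.

    Turns a string like '[np.float64(1.5), np.float64(0.2)]' into [1.5, 0.2].
    Returns [] for anything that cannot be parsed as a list of numbers.
    """
    if isinstance(raw, list):
        return raw
    if not isinstance(raw, str):
        return []  # e.g. NaN from an empty CSV cell
    # Remove the np.float64(...) wrapper but keep the number inside it
    cleaned = re.sub(r'np\.float64\((.*?)\)', r'\1', raw)
    try:
        return [float(v) for v in ast.literal_eval(cleaned)]
    except (ValueError, SyntaxError, TypeError):
        return []

# Made-up sample, not taken from pso_results.csv
sample = "[np.float64(12.5), np.float64(3.25), np.float64(0.5)]"
print(parse_cost_history(sample))  # [12.5, 3.25, 0.5]
```

Note that `ast.literal_eval` rejects bare `nan` and `inf`, so a history containing them comes back as an empty list with this approach.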


## Convergence Plots

The following plots show the convergence of the best configurations for each method (AsyncPSO, OpenMP_PSO, and PySwarms) on the Ackley, Rastrigin, and Rosenbrock functions. The y-axis (cost) is on a logarithmic scale to better visualize the convergence towards lower values.

In [2]:
# Plot the convergence of the best configuration of each method, one figure per function
if not df.empty:
    for func in functions:
        # Filter the DataFrame for the current function and get the best configurations
        func_df = df[df['function'] == func]
        if func_df.empty:
            print(f"Skipping convergence plot for {func}: No data.")
            continue
        
        best_async_series = func_df[func_df['method'] == 'async'].sort_values(by='cost')
        best_pyswarms_series = func_df[func_df['method'] == 'pyswarms'].sort_values(by='cost')
        best_openmp_series = func_df[func_df['method'] == 'openmp'].sort_values(by='cost')
        
        if best_async_series.empty or best_pyswarms_series.empty or best_openmp_series.empty:
            print(f"Skipping convergence plot for {func}: Missing data for one or more methods.")
            continue
            
        best_async = best_async_series.iloc[0]
        best_pyswarms = best_pyswarms_series.iloc[0]
        best_openmp = best_openmp_series.iloc[0]

        # Truncate the three cost histories to a common length so the curves are comparable
        min_length = min(len(best_async['cost_history']), len(best_pyswarms['cost_history']), len(best_openmp['cost_history']))
        if min_length == 0:
            print(f"Skipping convergence plot for {func}: Empty cost history.")
            continue
        my_async_history = best_async['cost_history'][:min_length]
        my_openmp_history = best_openmp['cost_history'][:min_length]
        pyswarms_history = best_pyswarms['cost_history'][:min_length]

        # Plotting the convergence comparison
        plt.figure(figsize=(10, 6))
        plt.plot(pyswarms_history, label='PySwarms (Best)', linestyle='--', color='orange', linewidth=2)
        plt.plot(my_async_history, label='AsyncPSO (Best)', linestyle='-', color='blue', linewidth=2)
        plt.plot(my_openmp_history, label='OpenMP_PSO (Best)', linestyle='-', color='green', linewidth=2)

        plt.yscale('log')
        # Pad the limits by a factor of 10; floor the bottom at 1e-6 because a log axis cannot show values <= 0
        all_costs = pyswarms_history + my_async_history + my_openmp_history
        if all_costs:
            y_bottom = max(1e-6, min(all_costs) / 10)
            plt.ylim(bottom=y_bottom, top=max(max(all_costs) * 10, y_bottom * 10))
        else:
            plt.ylim(bottom=1e-6, top=10)
        plt.xlabel('Iteration')
        plt.ylabel('Cost (log scale)')
        plt.title(f'Convergence Comparison: ({func.capitalize()} Function, Best Configurations)')
        plt.grid(True, which="both", linestyle="--", alpha=0.7)
        plt.legend()
        plt.show()
else:
    print("DataFrame is empty. Cannot generate convergence plots.")

DataFrame is empty. Cannot generate convergence plots.
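
Because a logarithmic axis cannot show zero or negative costs, the y-limits need a positive floor and a fallback for empty histories. A minimal sketch of that logic follows, using a hypothetical helper `log_ylim`. The floor of `1e-6` and the padding factor of 10 follow the values used in the plotting cell, while the function name and signature are illustrative.

```python
def log_ylim(histories, floor=1e-6, pad=10.0):
    """Return (bottom, top) limits for a log-scaled cost axis.

    histories: iterable of cost lists; empty lists are ignored.
    floor: smallest value allowed for the bottom limit (log axes need > 0).
    pad: multiplicative margin below the minimum and above the maximum.
    """
    values = [cost for history in histories for cost in history]
    if not values:
        return floor, pad  # fallback range when there is nothing to plot
    bottom = max(floor, min(values) / pad)
    top = max(max(values) * pad, bottom * pad)  # keep top strictly above bottom
    return bottom, top

# Made-up histories for illustration
bottom, top = log_ylim([[100.0, 1.0], [50.0, 0.1]])
```

The result could then be passed on as `plt.ylim(bottom=bottom, top=top)`.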


## Cost and Execution Time Distributions

The following box plots visualize the distribution of final costs and execution times across different methods, grouped by the number of particles and problem dimensionality, respectively. This helps in understanding the performance and efficiency trade-offs.

In [3]:
import seaborn as sns
# Box plots of final cost and execution time for each method (AsyncPSO, OpenMP_PSO, and PySwarms)
if not df.empty:
    plt.figure(figsize=(12, 7))
    sns.boxplot(x='n_particles', y='cost', hue='method', data=df)
    plt.yscale('log')
    plt.title('Final Cost Distribution by Number of Particles and Method (All Functions)')
    plt.xlabel('Number of Particles')
    plt.ylabel('Final Cost (log scale)')
    plt.legend(title='Method')
    plt.grid(True, which="both", linestyle="--", alpha=0.5)
    plt.show()

    plt.figure(figsize=(12, 7))
    sns.boxplot(x='dim', y='execution_time', hue='method', data=df)
    plt.yscale('log')  # Execution times can span several orders of magnitude
    plt.title('Execution Time Distribution by Dimension and Method (All Functions)')
    plt.xlabel('Dimension')
    plt.ylabel('Execution Time (log scale)')
    plt.legend(title='Method')
    plt.grid(True, which="both", linestyle="--", alpha=0.5)
    plt.show()
    
    # Suggestion: Add separate plots per function if needed
    # for func_name in df['function'].unique():
    #     plt.figure(figsize=(12, 7))
    #     sns.boxplot(x='n_particles', y='cost', hue='method', data=df[df['function']==func_name])
    #     plt.yscale('log')
    #     plt.title(f'Final Cost Distribution for {func_name.capitalize()} by N Particles and Method')
    #     plt.xlabel('Number of Particles')
    #     plt.ylabel('Final Cost (log scale)')
    #     plt.legend(title='Method')
    #     plt.grid(True, which="both", linestyle="--", alpha=0.5)
    #     plt.show()
else:
    print("DataFrame is empty. Cannot generate box plots.")

DataFrame is empty. Cannot generate box plots.
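
The spread that a box plot shows can also be put into numbers. The sketch below groups final costs by function and method, then reports the median, interquartile range, and standard deviation. The helper name `stability_stats` and the sample rows are hypothetical; the rows are made-up values, not measured results.

```python
import statistics
from collections import defaultdict

def stability_stats(rows):
    """Summarize the spread of final costs per (function, method) group."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row["function"], row["method"])].append(row["cost"])

    summary = {}
    for key, costs in groups.items():
        if len(costs) >= 2:
            # "inclusive" quartiles use linear interpolation between data points
            q1, median, q3 = statistics.quantiles(costs, n=4, method="inclusive")
            std = statistics.stdev(costs)
        else:
            q1 = median = q3 = costs[0]
            std = 0.0  # a single run has no measurable spread
        summary[key] = {"n": len(costs), "median": median, "iqr": q3 - q1, "std": std}
    return summary

# Made-up rows for illustration only
rows = [{"function": "ackley", "method": "async", "cost": c}
        for c in (1.0, 2.0, 3.0, 4.0, 5.0)]
stats = stability_stats(rows)
print(stats[("ackley", "async")]["median"])  # 3.0
```

With the real data, the rows could come from `df.to_dict("records")`, assuming the columns are named as in `expected_columns`.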


## Scatter Plots (Placeholder)

This section is intended for scatter plots comparing final cost and execution time between the different PSO methods. This type of plot can help visualize trade-offs, e.g., if one method is faster but achieves a slightly worse cost, or vice-versa.

**TODO:** Add scatter plots here if `pso_results.csv` is available. Example:
```python
import seaborn as sns
import matplotlib.pyplot as plt

# Assumes `df` has been loaded as in the first cell
if not df.empty:
    plt.figure(figsize=(10, 6))
    sns.scatterplot(data=df, x='execution_time', y='cost', hue='method', style='function', size='n_particles')
    plt.xscale('log')
    plt.yscale('log')
    plt.title('Cost vs. Execution Time by Method, Function, and Number of Particles')
    plt.xlabel('Execution Time (log scale)')
    plt.ylabel('Final Cost (log scale)')
    plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
    plt.grid(True, which="both", linestyle="--", alpha=0.7)
    plt.tight_layout()
    plt.show()
else:
    print("DataFrame is empty. Cannot generate scatter plot.")
```
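
One way to make the cost/time trade-off explicit is to keep only the runs that no other run beats on both execution time and final cost (a Pareto front). The sketch below uses a hypothetical helper, `pareto_front`, and made-up `(execution_time, cost)` pairs. It illustrates the idea and is not part of the original analysis.

```python
def pareto_front(points):
    """Return the (time, cost) points not dominated by any other point.

    A point dominates another if it is no worse in both time and cost
    and strictly better in at least one. Exact duplicates are kept once.
    """
    front = []
    best_cost = float("inf")
    # After sorting by time (then cost), a point is on the front only if
    # its cost beats the lowest cost seen among all faster points.
    for time, cost in sorted(points):
        if cost < best_cost:
            front.append((time, cost))
            best_cost = cost
    return front

# Made-up (execution_time, cost) pairs for illustration
runs = [(0.5, 1e-2), (1.0, 1e-4), (2.0, 1e-3), (3.0, 1e-6), (0.7, 5e-2)]
print(pareto_front(runs))  # [(0.5, 0.01), (1.0, 0.0001), (3.0, 1e-06)]
```

Runs on the front are the ones worth comparing across methods; every other run is both slower and less accurate than some run on the front.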

In [4]:
# This cell is intentionally left for implementing scatter plots as described above.

## Summary and Conclusions (Placeholder)

This section should summarize the key findings from the analysis.

**Once results are available, the summary should address:**
* Which PSO method (AsyncPSO, OpenMP_PSO, PySwarms) generally performs best for the tested functions in terms of final cost.
* Which method is most computationally efficient (lowest execution time).
* How parameters like the number of particles and problem dimensionality affect performance and efficiency.
* Whether there are specific methods that excel on particular types of functions.
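
For the efficiency question, one possible metric is the speedup of each method relative to PySwarms, based on median execution times. The sketch below assumes the method labels `async`, `openmp`, and `pyswarms` used in the convergence cell. The helper name `speedup_vs_baseline` and the timings are hypothetical and made up for illustration.

```python
import statistics
from collections import defaultdict

def speedup_vs_baseline(rows, baseline="pyswarms"):
    """Return {method: baseline_median_time / method_median_time}."""
    times = defaultdict(list)
    for row in rows:
        times[row["method"]].append(row["execution_time"])
    if baseline not in times:
        raise KeyError(f"No rows found for baseline method '{baseline}'")

    medians = {method: statistics.median(vals) for method, vals in times.items()}
    base = medians[baseline]
    # A value above 1 means the method is faster than the baseline
    return {method: (base / t if t > 0 else float("inf"))
            for method, t in medians.items()}

# Made-up timings in seconds, for illustration only
rows = [
    {"method": "pyswarms", "execution_time": 4.0},
    {"method": "pyswarms", "execution_time": 6.0},
    {"method": "async", "execution_time": 1.0},
    {"method": "async", "execution_time": 1.5},
    {"method": "openmp", "execution_time": 2.5},
]
print(speedup_vs_baseline(rows))  # {'pyswarms': 1.0, 'async': 4.0, 'openmp': 2.0}
```

Medians are used instead of means so that a single slow run does not dominate the comparison.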

**TODO:** Add a detailed summary here once the notebook can be fully executed with `pso_results.csv`.