# Parallel Property Flash Calculations with NeqSim

This notebook demonstrates how to use NeqSim's parallel property flash calculations from Python using direct Java access via JPype. The parallel implementation distributes batch thermodynamic calculations across multiple CPU cores to reduce wall-clock time.

## Features Covered

1. **Direct Java Access** - Full control using `jneqsim` module
2. **Sequential vs Parallel Comparison** - Performance benchmarking
3. **PT, PH, PS Flash Calculations** - All supported flash modes
4. **Varying Compositions** - Per-point compositions through the `onlineFractions` parameter
5. **Batch Processing** - Chunked handling of large datasets

## Prerequisites

- Python 3.8+
- NeqSim Python package (`pip install neqsim`)
- Java 8+ installed and configured

## 1. Import Required Libraries

We import NeqSim and access Java classes directly through the `jneqsim` module. This provides full control over the Java API.

In [2]:
# Import neqsim for direct Java access
from neqsim import jneqsim

# Python standard libraries
import time
import numpy as np
import pandas as pd

# For visualization
import matplotlib.pyplot as plt

print("NeqSim loaded successfully")
print(f"Available processors: {jneqsim.util.NeqSimThreadPool.getDefaultPoolSize()}")

NeqSim loaded successfully
Available processors: 16


## 2. Create Fluid System Using Direct Java Access

We create a natural gas fluid system using the SRK equation of state. The `SystemSrkEos` class provides access to the Soave-Redlich-Kwong cubic equation of state.

In [3]:
# Create a natural gas fluid using SRK equation of state
# Initial conditions: 25°C, 50 bar
gas = jneqsim.thermo.system.SystemSrkEos(273.15 + 25.0, 50.0)

# Add components with mole fractions
gas.addComponent("methane", 0.85)
gas.addComponent("ethane", 0.08)
gas.addComponent("propane", 0.04)
gas.addComponent("n-butane", 0.02)
gas.addComponent("CO2", 0.01)

# Set mixing rule (classic van der Waals mixing rules)
gas.setMixingRule("classic")

# Create ThermodynamicOperations object for flash calculations
ops = jneqsim.thermodynamicoperations.ThermodynamicOperations(gas)

print("Fluid created successfully!")
print(f"Number of components: {gas.getNumberOfComponents()}")
print(f"Component names: {list(gas.getComponentNames())}")

Fluid created successfully!
Number of components: 5
Component names: ['methane', 'ethane', 'propane', 'n-butane', 'CO2']


## 3. Prepare Test Data

We'll generate pressure-temperature points to flash. This simulates real-world scenarios like processing time-series sensor data or generating property tables.

In [4]:
# Number of calculation points
num_points = 100

# Generate pressure and temperature arrays as Python lists
# Pressures: 10 to 100 bar
# Temperatures: 250 to 350 K
pressures = list(np.linspace(10.0, 100.0, num_points))
temperatures = list(np.linspace(250.0, 350.0, num_points))

print(f"Generated {num_points} test points")
print(f"Pressure range: {pressures[0]:.1f} - {pressures[-1]:.1f} bar")
print(f"Temperature range: {temperatures[0]:.1f} - {temperatures[-1]:.1f} K")

Generated 100 test points
Pressure range: 10.0 - 100.0 bar
Temperature range: 250.0 - 350.0 K
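
For the property-table use case, a full pressure-temperature grid is often more useful than a single sweep in which both variables rise together. The sketch below uses only the standard library; `linspace` and `make_pt_grid` are hypothetical helpers, not part of NeqSim, and they produce two flat lists of equal length in the same form as `pressures` and `temperatures`.

```python
from itertools import product

def linspace(start, stop, n):
    """Return n evenly spaced floats from start to stop inclusive."""
    if n == 1:
        return [float(start)]
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]

def make_pt_grid(p_values, t_values):
    """Flatten the Cartesian product of pressures and temperatures into two parallel lists."""
    pairs = list(product(p_values, t_values))
    grid_p = [p for p, _ in pairs]
    grid_t = [t for _, t in pairs]
    return grid_p, grid_t

grid_pressures, grid_temperatures = make_pt_grid(
    linspace(10.0, 100.0, 4),   # 10, 40, 70, 100 bar
    linspace(250.0, 350.0, 3),  # 250, 300, 350 K
)
print(len(grid_pressures))  # 12 points: every pressure paired with every temperature
```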


## 4. Sequential vs Parallel Performance Comparison

Now we'll compare the execution time of sequential (`propertyFlash`) vs parallel (`propertyFlashParallel`) calculations.

### Flash Modes
- **Mode 1 (PT)**: Pressure-Temperature flash
- **Mode 2 (PH)**: Pressure-Enthalpy flash  
- **Mode 3 (PS)**: Pressure-Entropy flash
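
Bare integer codes are easy to mix up in calling code. One option is to name them on the Python side, as sketched here. `FlashMode` is a hypothetical convenience class defined in this notebook only; NeqSim itself still receives the plain integer.

```python
from enum import IntEnum

class FlashMode(IntEnum):
    """Hypothetical Python-side names for the integer mode codes listed above."""
    PT = 1  # pressure-temperature
    PH = 2  # pressure-enthalpy
    PS = 3  # pressure-entropy

mode = FlashMode.PH
print(int(mode), mode.name)
```

Passing `int(mode)` rather than the enum member keeps the argument an ordinary Python `int` when it crosses into Java.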

In [5]:
# Warm-up run (JIT compilation)
_ = ops.propertyFlash(pressures, temperatures, 1, None, None)
_ = ops.propertyFlashParallel(pressures, temperatures, 1, None, None)

# Sequential PT flash
start_seq = time.time()
result_seq = ops.propertyFlash(pressures, temperatures, 1, None, None)
time_seq = time.time() - start_seq

# Parallel PT flash (using all CPU cores)
start_par = time.time()
result_par = ops.propertyFlashParallel(pressures, temperatures, 1, None, None)
time_par = time.time() - start_par

# Calculate speedup
speedup = time_seq / time_par

print(f"=== Performance Results for {num_points} PT Flash Calculations ===")
print(f"Sequential execution time: {time_seq*1000:.2f} ms")
print(f"Parallel execution time:   {time_par*1000:.2f} ms")
print(f"Speedup: {speedup:.2f}x")
print(f"\nErrors in sequential: {sum(1 for e in result_seq.calculationError if e is not None)}")
print(f"Errors in parallel:   {sum(1 for e in result_par.calculationError if e is not None)}")

TypeError: No matching overloads found for neqsim.thermodynamicoperations.ThermodynamicOperations.propertyFlash(list,list,int,NoneType,NoneType), options are:
	public neqsim.api.ioc.CalculationResult neqsim.thermodynamicoperations.ThermodynamicOperations.propertyFlash(java.util.List,java.util.List,int,java.util.List,java.util.List)
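
The traceback shows that JPype did not convert the Python `list` arguments to `java.util.List` for this overload. One possible workaround, sketched here under that assumption, is to copy each input into a `java.util.ArrayList` before the call. `to_java_list` is a hypothetical helper, not part of NeqSim, and the `list_factory` parameter exists only so that the helper can be exercised without a running JVM.

```python
def to_java_list(values, list_factory=None):
    """Copy a Python sequence of numbers into a Java-style list; None is passed through."""
    if values is None:
        return None
    if list_factory is None:
        # Imported lazily: this requires JPype with the JVM already started
        from jpype import JClass
        list_factory = JClass("java.util.ArrayList")
    java_list = list_factory()
    for value in values:
        # Plain Python floats avoid passing numpy scalar types to Java
        java_list.add(float(value))
    return java_list
```

With the JVM running, the call to try would be `ops.propertyFlash(to_java_list(pressures), to_java_list(temperatures), 1, None, None)`. Whether this resolves the overload should be confirmed against the installed JPype and NeqSim versions.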


## 5. Validate Results Match

Let's verify that parallel execution produces identical results to sequential execution.

In [None]:
# Compare results between sequential and parallel execution
max_diff = 0.0
num_properties = len(result_seq.fluidProperties[0]) if result_seq.fluidProperties[0] else 0

for i in range(num_points):
    if result_seq.calculationError[i] is None and result_par.calculationError[i] is None:
        for j in range(num_properties):
            seq_val = result_seq.fluidProperties[i][j]
            par_val = result_par.fluidProperties[i][j]
            if seq_val is not None and par_val is not None:
                diff = abs(seq_val - par_val)
                if diff > max_diff:
                    max_diff = diff

print(f"Maximum difference between sequential and parallel results: {max_diff:.2e}")
if max_diff < 1e-10:
    print("✓ Results agree within 1e-10")
else:
    print("⚠ Results differ (check for numerical precision issues)")
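
The API reference notes that some properties (JT coefficient, sound velocity) may be NaN, and a NaN difference is silently ignored by a `diff > max_diff` comparison. The sketch below makes that handling explicit with plain Python lists standing in for `fluidProperties`; `max_abs_diff` is a hypothetical helper, not a NeqSim function.

```python
import math

def max_abs_diff(rows_a, rows_b):
    """Largest |a - b| over paired values, skipping None and matching NaN entries."""
    worst = 0.0
    for row_a, row_b in zip(rows_a, rows_b):
        for a, b in zip(row_a, row_b):
            if a is None or b is None:
                continue
            a, b = float(a), float(b)
            if math.isnan(a) and math.isnan(b):
                continue  # undefined on both sides: treat as agreeing
            diff = abs(a - b)
            if math.isnan(diff):
                return math.nan  # defined on one side only: report as a mismatch
            worst = max(worst, diff)
    return worst
```

A NaN return value signals that one run produced a number where the other did not, which a tolerance check alone would miss.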

## 6. Extract and Visualize Results

Convert Java results to Python and create visualizations of the calculated properties.

In [None]:
# Get property names from the system
property_names = list(gas.getProperties().getNames())

# Convert results to pandas DataFrame
data = []
for i in range(num_points):
    if result_par.calculationError[i] is None:
        row = {
            'Pressure_bar': pressures[i],
            'Temperature_K': temperatures[i]
        }
        # Add all properties (first few for display)
        for j, name in enumerate(property_names[:10]):  # First 10 properties
            if result_par.fluidProperties[i][j] is not None:
                row[name] = float(result_par.fluidProperties[i][j])
        data.append(row)

df = pd.DataFrame(data)
print("Sample of calculated properties:")
df.head(10)

In [None]:
# Visualize results
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Plot density vs temperature
if 'Density' in df.columns:
    axes[0].scatter(df['Temperature_K'], df['Density'], c=df['Pressure_bar'], cmap='viridis')
    axes[0].set_xlabel('Temperature (K)')
    axes[0].set_ylabel('Density (kg/m³)')
    axes[0].set_title('Density vs Temperature')
    plt.colorbar(axes[0].collections[0], ax=axes[0], label='Pressure (bar)')

# Plot compressibility factor vs pressure
if 'Z' in df.columns:
    axes[1].scatter(df['Pressure_bar'], df['Z'], c=df['Temperature_K'], cmap='coolwarm')
    axes[1].set_xlabel('Pressure (bar)')
    axes[1].set_ylabel('Compressibility Factor (Z)')
    axes[1].set_title('Z-factor vs Pressure')
    plt.colorbar(axes[1].collections[0], ax=axes[1], label='Temperature (K)')

plt.tight_layout()
plt.show()

## 7. Scaling with Different Thread Counts

Let's measure how performance scales with different numbers of threads.

In [None]:
# Test with different thread counts
import os
max_threads = os.cpu_count() or 4
# Deduplicate so each thread count is benchmarked once (e.g. when max_threads is 4 or 8)
thread_counts = sorted({t for t in (1, 2, 4, 8, max_threads) if t <= max_threads})

times = []
for num_threads in thread_counts:
    # Run multiple times for averaging
    run_times = []
    for _ in range(3):
        start = time.time()
        result = ops.propertyFlashParallel(pressures, temperatures, 1, None, None, num_threads)
        run_times.append(time.time() - start)
    avg_time = np.mean(run_times)
    times.append(avg_time)
    print(f"Threads: {num_threads:2d} | Time: {avg_time*1000:8.2f} ms | Speedup: {times[0]/avg_time:.2f}x")

# Plot scaling
plt.figure(figsize=(8, 5))
plt.plot(thread_counts, [times[0]/t for t in times], 'bo-', label='Actual speedup')
plt.plot(thread_counts, thread_counts, 'r--', alpha=0.5, label='Linear (ideal) speedup')
plt.xlabel('Number of Threads')
plt.ylabel('Speedup (x)')
plt.title('Parallel Scaling Performance')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
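
Speedup alone hides how well each added thread is used. Parallel efficiency, defined as speedup divided by the thread-count ratio, makes that visible. The sketch below computes both from a list of timings; `scaling_table` is a hypothetical helper and the timings are invented for illustration, not measured results.

```python
def scaling_table(thread_counts, times):
    """Return (threads, speedup, efficiency) tuples relative to the first timing."""
    baseline_threads, baseline_time = thread_counts[0], times[0]
    rows = []
    for n, t in zip(thread_counts, times):
        speedup = baseline_time / t
        efficiency = speedup / (n / baseline_threads)
        rows.append((n, speedup, efficiency))
    return rows

# Made-up timings in seconds, for illustration only
for n, s, e in scaling_table([1, 2, 4, 8], [0.80, 0.44, 0.25, 0.16]):
    print(f"{n:2d} threads: speedup {s:.2f}x, efficiency {e:.0%}")
```

An efficiency that falls steadily as threads are added usually points to per-task overhead or a serial portion of the work.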

## 8. PH Flash (Pressure-Enthalpy) Example

For process simulation, PH flash is commonly used when enthalpy is known (e.g., after a heat exchanger or valve).

In [None]:
# First, calculate enthalpies at known PT conditions
num_ph_points = 50
ph_pressures = []
enthalpies = []

for i in range(num_ph_points):
    P = 20.0 + i * 1.5  # 20 to 93.5 bar
    T = 280.0 + i * 1.5  # 280 to 353.5 K
    
    # Clone the system and calculate enthalpy at this PT
    temp_fluid = gas.clone()
    temp_fluid.setPressure(P)
    temp_fluid.setTemperature(T)
    temp_ops = jneqsim.thermodynamicoperations.ThermodynamicOperations(temp_fluid)
    temp_ops.TPflash()
    temp_fluid.init(2)
    
    ph_pressures.append(P)
    enthalpies.append(temp_fluid.getEnthalpy("J/mol"))

# Run parallel PH flash (mode 2)
start = time.time()
ph_result = ops.propertyFlashParallel(ph_pressures, enthalpies, 2, None, None)
ph_time = time.time() - start

print(f"PH Flash: {num_ph_points} points in {ph_time*1000:.2f} ms")
print(f"Errors: {sum(1 for e in ph_result.calculationError if e is not None)}")

## 9. Varying Compositions (Online Fractions)

In real-time monitoring scenarios, the fluid composition may vary for each data point. The parallel flash supports this via the `onlineFractions` parameter.

In [None]:
# Create a simpler fluid for varying composition example
var_gas = jneqsim.thermo.system.SystemSrkEos(273.15 + 25.0, 50.0)
var_gas.addComponent("methane", 0.85)
var_gas.addComponent("ethane", 0.10)
var_gas.addComponent("propane", 0.05)
var_gas.setMixingRule("classic")
var_ops = jneqsim.thermodynamicoperations.ThermodynamicOperations(var_gas)

# Generate varying compositions
num_var_points = 30
var_pressures = list(np.linspace(20.0, 80.0, num_var_points))
var_temps = list(np.linspace(280.0, 340.0, num_var_points))

# Component names (must match the fluid)
components = ["methane", "ethane", "propane"]

# Create varying fractions - simulate composition changes over time
methane_fracs = []
ethane_fracs = []
propane_fracs = []

for i in range(num_var_points):
    # Methane: 80% to 90%
    methane = 0.80 + 0.01 * i / 3
    # Ethane: 12% to 8%
    ethane = 0.12 - 0.004 * i / 3
    # Propane: balance to 100%
    propane = 1.0 - methane - ethane
    
    methane_fracs.append(methane)
    ethane_fracs.append(ethane)
    propane_fracs.append(propane)

# Create the onlineFractions list (list of lists)
onlineFractions = [methane_fracs, ethane_fracs, propane_fracs]

# Run parallel flash with varying compositions
start = time.time()
var_result = var_ops.propertyFlashParallel(var_pressures, var_temps, 1, components, onlineFractions)
var_time = time.time() - start

print(f"Varying composition flash: {num_var_points} points in {var_time*1000:.2f} ms")
print(f"Errors: {sum(1 for e in var_result.calculationError if e is not None)}")
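
Sensor data often arrives as one composition per time step, whereas `onlineFractions` as built above holds one list per component. The sketch below transposes between the two layouts and checks that each point sums to 1. `to_online_fractions` is a hypothetical helper; whether NeqSim normalizes fractions internally is not stated in this notebook, so the check is a precaution.

```python
def to_online_fractions(compositions, components):
    """Turn per-point {component: fraction} dicts into per-component lists."""
    for i, comp in enumerate(compositions):
        total = sum(comp[name] for name in components)
        if abs(total - 1.0) > 1e-9:
            raise ValueError(f"Point {i}: fractions sum to {total}, expected 1.0")
    return [[comp[name] for comp in compositions] for name in components]

samples = [
    {"methane": 0.80, "ethane": 0.12, "propane": 0.08},
    {"methane": 0.85, "ethane": 0.10, "propane": 0.05},
]
fractions = to_online_fractions(samples, ["methane", "ethane", "propane"])
print(fractions)  # [[0.8, 0.85], [0.12, 0.1], [0.08, 0.05]]
```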

## 10. Batch Processing for Large Datasets

For very large datasets (thousands of points), use `propertyFlashBatch` to process in chunks, reducing peak memory usage.

In [None]:
# Large dataset example
large_num_points = 500
large_pressures = list(np.random.uniform(10.0, 100.0, large_num_points))
large_temps = list(np.random.uniform(250.0, 350.0, large_num_points))

# Compare parallel vs batch processing
batch_sizes = [50, 100, 200]

print(f"Processing {large_num_points} points:\n")

# Full parallel
start = time.time()
result_full = ops.propertyFlashParallel(large_pressures, large_temps, 1, None, None)
time_full = time.time() - start
print(f"Full parallel:      {time_full*1000:8.2f} ms")

# Batch processing with different sizes
for batch_size in batch_sizes:
    start = time.time()
    result_batch = ops.propertyFlashBatch(large_pressures, large_temps, 1, None, None, batch_size)
    time_batch = time.time() - start
    print(f"Batch (size={batch_size:3d}):  {time_batch*1000:8.2f} ms")
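
How `propertyFlashBatch` partitions its input internally is not shown in this notebook. The sketch below only illustrates the general idea of chunking, using a hypothetical `batch_ranges` generator that splits a dataset into consecutive index ranges of at most `batch_size` points.

```python
def batch_ranges(num_points, batch_size):
    """Yield (start, stop) index pairs covering range(num_points) in order."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, num_points, batch_size):
        yield start, min(start + batch_size, num_points)

print(list(batch_ranges(500, 200)))  # [(0, 200), (200, 400), (400, 500)]
```

Only one chunk of inputs and results needs to be held at a time, which is the source of the lower peak memory use.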

## 11. Unit Tests

Here we implement test functions to validate the parallel flash implementation.

In [None]:
def test_parallel_matches_sequential():
    """Test that parallel results match sequential results."""
    test_gas = jneqsim.thermo.system.SystemSrkEos(273.15, 50.0)
    test_gas.addComponent("methane", 0.9)
    test_gas.addComponent("ethane", 0.1)
    test_gas.setMixingRule("classic")
    test_ops = jneqsim.thermodynamicoperations.ThermodynamicOperations(test_gas)
    
    p = list(np.linspace(10, 50, 20))
    t = list(np.linspace(280, 320, 20))
    
    seq = test_ops.propertyFlash(p, t, 1, None, None)
    par = test_ops.propertyFlashParallel(p, t, 1, None, None)
    
    for i in range(20):
        assert seq.calculationError[i] is None, f"Sequential error at {i}"
        assert par.calculationError[i] is None, f"Parallel error at {i}"
        for j in range(len(seq.fluidProperties[i])):
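            seq_val, par_val = seq.fluidProperties[i][j], par.fluidProperties[i][j]
            # Compare against None explicitly so that legitimate 0.0 values are not skipped
            if seq_val is not None and par_val is not None:
                diff = abs(seq_val - par_val)
                # NaN-valued properties give a NaN difference; accept those rather than fail
                assert np.isnan(diff) or diff < 1e-10, f"Mismatch at point {i}, property {j}: {diff}"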
    
    print("✓ test_parallel_matches_sequential PASSED")

def test_invalid_flash_mode():
    """Test error handling for invalid flash mode."""
    test_gas = jneqsim.thermo.system.SystemSrkEos(273.15, 50.0)
    test_gas.addComponent("methane", 1.0)
    test_gas.setMixingRule("classic")
    test_ops = jneqsim.thermodynamicoperations.ThermodynamicOperations(test_gas)
    
    p = [10.0, 20.0]
    t = [300.0, 310.0]
    
    result = test_ops.propertyFlashParallel(p, t, 99, None, None)
    
    for i in range(2):
        assert result.calculationError[i] is not None, "Should have error for invalid mode"
        assert "FlashMode" in result.calculationError[i], "Error should mention FlashMode"
    
    print("✓ test_invalid_flash_mode PASSED")

def test_nan_handling():
    """Test that NaN inputs are handled gracefully."""
    test_gas = jneqsim.thermo.system.SystemSrkEos(273.15, 50.0)
    test_gas.addComponent("methane", 1.0)
    test_gas.setMixingRule("classic")
    test_ops = jneqsim.thermodynamicoperations.ThermodynamicOperations(test_gas)
    
    p = [10.0, float('nan'), 30.0]
    t = [300.0, 310.0, 320.0]
    
    result = test_ops.propertyFlashParallel(p, t, 1, None, None)
    
    assert result.calculationError[0] is None, "First point should succeed"
    assert result.calculationError[1] is not None, "Second point should have error"
    assert result.calculationError[2] is None, "Third point should succeed"
    
    print("✓ test_nan_handling PASSED")

# Run all tests
print("Running unit tests...\n")
test_parallel_matches_sequential()
test_invalid_flash_mode()
test_nan_handling()
print("\n✓ All tests passed!")

## 12. API Reference and Documentation

### Method Signatures

| Method | Description |
|--------|-------------|
| `propertyFlash(Spec1, Spec2, FlashMode, components, onlineFractions)` | Sequential execution |
| `propertyFlashParallel(Spec1, Spec2, FlashMode, components, onlineFractions)` | Parallel using all CPU cores |
| `propertyFlashParallel(..., numThreads)` | Parallel with specified thread count |
| `propertyFlashBatch(..., batchSize)` | Batch processing for large datasets |

### Flash Modes

| Mode | Type | Spec2 Unit |
|------|------|------------|
| 1 | PT (Pressure-Temperature) | Temperature in K |
| 2 | PH (Pressure-Enthalpy) | Enthalpy in J/mol |
| 3 | PS (Pressure-Entropy) | Entropy in J/(mol·K) |

### Property Indices

| Index | Property | Units |
|-------|----------|-------|
| 0 | Number of Phases | - |
| 1 | Pressure | Pa |
| 2 | Temperature | K |
| 3 | Mole Percent | % |
| 4 | Weight Percent | % |
| 5 | Molar Volume | m³/mol |
| 6 | Volume Percent | % |
| 7 | Density | kg/m³ |
| 8 | Z Factor | - |
| 9 | Molecular Weight | g/mol |
| 10 | Enthalpy | J/mol |
| 11 | Entropy | J/(mol·K) |
| 12 | Heat Capacity Cp | J/(mol·K) |
| 13 | Heat Capacity Cv | J/(mol·K) |
| 14 | Kappa (Cp/Cv) | - |
| 15 | JT Coefficient | K/Pa (may be NaN) |
| 16 | Sound Velocity | m/s (may be NaN) |
| 17 | Viscosity | Pa·s |
| 18 | Thermal Conductivity | W/(m·K) |
| 19+ | Phase-specific properties | (gas, oil, aqueous) |
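
Looking up properties by name avoids scattering index numbers through analysis code. The sketch below copies a few indices from the table above into a dictionary and extracts one column from a list of rows. `PROPERTY_INDEX` and `property_column` are hypothetical names, and the indices should be verified against the NeqSim version in use.

```python
# Indices copied from the table above; not read from NeqSim at runtime
PROPERTY_INDEX = {
    "Number of Phases": 0,
    "Density": 7,
    "Z Factor": 8,
    "Enthalpy": 10,
    "Entropy": 11,
}

def property_column(rows, name):
    """Extract one property across all points; None where a row is missing."""
    idx = PROPERTY_INDEX[name]
    return [None if row is None else row[idx] for row in rows]

# Stand-in rows: each value equals its own index, so lookups are easy to check
rows = [list(range(19)), None, list(range(19))]
print(property_column(rows, "Density"))  # [7, None, 7]
```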

### CalculationResult Object

```python
result.fluidProperties  # Double[numPoints][numProperties] - calculated values
result.calculationError # String[numPoints] - None if success, error message if failed
```
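
Because errors are reported per point, a common first step is to separate the points that succeeded from those that failed. The sketch below does this for any sequence shaped like `calculationError`; `split_by_status` is a hypothetical helper and the error message is invented.

```python
def split_by_status(errors):
    """Return (ok_indices, failures) where failures maps index -> error message."""
    ok, failed = [], {}
    for i, err in enumerate(errors):
        if err is None:
            ok.append(i)
        else:
            failed[i] = str(err)
    return ok, failed

# Stand-in for result.calculationError; the message text is invented
ok, failed = split_by_status([None, "flash did not converge", None])
print(ok, failed)  # [0, 2] {1: 'flash did not converge'}
```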

### Best Practices

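1. **Check list conversion** - The signatures take `java.util.List` arguments, and the `TypeError` recorded in Section 4 shows Python lists being rejected, so inputs may need converting to a Java list such as `java.util.ArrayList` first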
2. **Use parallel for > 10 points** - Overhead makes parallel slower for very few points
3. **Use batch processing** for > 1000 points to manage memory
4. **Handle errors per point** - One failed calculation doesn't affect others
5. **Warm up JVM** for benchmarking - First calls are slower due to JIT compilation
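
The warm-up advice can be folded into a small timing helper. The sketch below runs a callable a few times untimed, then reports the median of several timed runs using `time.perf_counter`, which is better suited to interval timing than `time.time`. `benchmark` is a hypothetical helper and the workload is a stand-in.

```python
import statistics
import time

def benchmark(func, warmup=2, repeats=5):
    """Call func() warmup times untimed, then return the median of repeats timed runs."""
    for _ in range(warmup):
        func()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Stand-in workload; replace with a lambda wrapping the flash call
elapsed = benchmark(lambda: sum(i * i for i in range(10_000)))
print(f"median: {elapsed * 1000:.3f} ms")
```

The median is less sensitive than the mean to a single slow run caused by garbage collection or other background activity.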

## Summary

This notebook demonstrated:

1. **Direct Java access** using `jneqsim` for full control over NeqSim
2. **Parallel flash calculations** that leverage multiple CPU cores
3. **Performance benchmarking** of sequential versus parallel execution, with an expected speedup of 2-10x depending on hardware and dataset size
4. **Different flash types** (PT, PH, PS) all supporting parallel execution
5. **Varying compositions** for real-time monitoring scenarios
6. **Batch processing** for memory-efficient handling of large datasets
7. **Robust error handling** where individual point failures don't affect others

For more information, see:
- [NeqSim Documentation](https://equinor.github.io/neqsimhome/)
- [NeqSim Python Wiki](https://github.com/equinor/neqsim-python/wiki)
- [NeqSim Java API](https://github.com/equinor/neqsim)