# 📊 Performance Comparison of `calculate_sip` Implementations

## 🧠 Objective

Evaluate and compare the **correctness** and **performance** of several implementations of the same calculation: scalar (using `math` or `numpy` functions) and array-based (using `numpy`). The example calculation is the subionospheric point (SIP) for GNSS ray paths. Four setups are compared:

* scalar using `math` trigonometric functions
* scalar using `numpy` trigonometric functions
* manual vectorization using `numpy`
* automatic vectorization of the scalar function with the `numpy.vectorize` wrapper

---

## ✅ Functional Equivalence Test

* **Purpose**: Ensure the **NumPy vectorized version** reproduces the results of the original scalar version to within floating-point tolerance.
* **Method**:

  * Generate 10000 random test cases.
  * Compare outputs of `calculate_sip_scalar_math()` and `calculate_sip_vectorized()` using `np.allclose()`.
* **Result**: ✅ All tests passed with `atol=1e-10` (the default `rtol=1e-5` of `np.allclose()` also applies).

---

## 🚀 Performance Benchmark

### 🧪 Test Setup

* Random input vectors of size `N = 10,000`.
* Timing measured using Python’s `timeit` module; each reported time is the total for 3 consecutive runs (`number=3`) and includes test-data generation.
* Implementations tested:

  1. `calculate_sip_scalar_math()` in a Python `for` loop
  2. `calculate_sip_scalar_numpy()` in a Python `for` loop
  3. `calculate_sip_scalar_math()` wrapped with `np.vectorize()`
  4. Fully vectorized version using NumPy (`calculate_sip_vectorized()`)
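As a possible refinement of this setup, the sketch below uses `timeit.repeat` and takes the minimum over several rounds, with the input array created once outside the timed callables. The helper name `best_time_per_call` and the use of `np.cos` as a stand-in workload are assumptions made for illustration; they are not part of the notebook.

```python
import timeit
import numpy as np


def best_time_per_call(func, repeat=5, number=3):
    """Best (minimum) time per call in seconds over `repeat` timing rounds."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number


# Inputs are created once, outside the timed callables
rng = np.random.default_rng(0)
el = rng.uniform(0.01, np.pi / 2 - 0.01, 10_000)

# Stand-in workloads: one array call versus a Python loop over elements
t_array = best_time_per_call(lambda: np.cos(el))
t_loop = best_time_per_call(lambda: [np.cos(x) for x in el])

print(f"array call:  {t_array:.6f} s per call")
print(f"python loop: {t_loop:.6f} s per call")
```

Taking the minimum reduces the influence of background activity on the machine, and keeping data generation out of the timed region means the ratio reflects only the computation being compared.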

### ⏱️ Results (from one sample run; the last cell shows another)

```
Scalar-math function time (3 runs):        0.0529 seconds
Scalar-numpy function time (3 runs):       0.2338 seconds
Vectorized function time (3 runs):         0.0037 seconds
Naive vectorized function time (3 runs):   0.0331 seconds

Speedup numpy-Scalar vs math-Scalar:      ~0.23x           
Speedup Vectorized vs math-Scalar:       ~14.25x
Speedup Naive Vectorized vs Scalar:       ~1.60x
```

---

## 🔍 Insights

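* ✅ `np.vectorize()` offers **API convenience** but little performance benefit: it is essentially a Python-level loop, and was only about 1.4x to 1.6x faster than the explicit `for` loop here.
* ⚠️ The **true NumPy vectorized implementation** significantly outperforms both the scalar and `np.vectorize()` versions (about 12x to 14x faster than the scalar `math` loop).
* ✅ NumPy's vectorized math functions run their loops in compiled code, making them well suited to large array-based geospatial calculations.
* ⚠️ For scalar inputs, `numpy` functions are slower than their `math` counterparts: the `numpy` scalar loop took about 4x to 5x longer.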
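The first point can be checked directly by counting how often the wrapped Python function is invoked. In this minimal sketch, `counted_sin` and the `calls` counter are hypothetical helpers; `otypes` is passed so that `np.vectorize` does not need an extra call to infer the output type.

```python
import math
import numpy as np

calls = {"n": 0}  # hypothetical counter, for illustration only


def counted_sin(x):
    """Scalar sine that records how many times it was called."""
    calls["n"] += 1
    return math.sin(x)


# otypes is given explicitly, so no extra call is made to infer the dtype
vectorized_sin = np.vectorize(counted_sin, otypes=[float])

x = np.linspace(0.0, 1.0, 1000)
y = vectorized_sin(x)

print(calls["n"])  # one Python-level call per array element
```

The Python function runs once per element, so the interpreter overhead of a loop remains; only the array-handling boilerplate is hidden.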

---

## 📌 Conclusion

Vectorize manually with `numpy` array operations. In computation-heavy code, avoid Python `for` and `while` loops and replace them with `numpy` array expressions. Do not use `numpy` trigonometric functions for purely scalar calculations; use `math` instead, which was roughly 4 to 5 times faster in these runs.
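The scalar recommendation can be reproduced in isolation with a single trigonometric call. This is a rough sketch: the input value and call count are arbitrary choices, and absolute timings depend on the machine and the installed versions.

```python
import math
import timeit
import numpy as np

x = 0.5  # arbitrary scalar input
n_calls = 100_000

# Total time for n_calls scalar evaluations with each library
t_math = timeit.timeit(lambda: math.sin(x), number=n_calls)
t_numpy = timeit.timeit(lambda: np.sin(x), number=n_calls)

# Both give the same value, but numpy returns its own scalar type
print(type(math.sin(x)).__name__, type(np.sin(x)).__name__)
print(f"math.sin: {t_math / n_calls * 1e9:.0f} ns per call")
print(f"np.sin:   {t_numpy / n_calls * 1e9:.0f} ns per call")
```

A likely explanation for the gap is that every `numpy` call goes through general array dispatch even for a single value, whereas `math` operates directly on a Python float.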


## SIP calculations 

In [45]:
import numpy as np
import math
RE_meters = 6371000

def calculate_sip_scalar_math(s_lat, s_lon, az, el, hm = 350000, R=RE_meters):
    """
    Calculate the subionospheric point (SIP) for a site-satellite line of sight.
    Parameters:
        s_lat, s_lon - site latitude and longitude in radians
        hm - height of the ionospheric maximum (meters)
        az, el - azimuth and elevation of the site-satellite line of sight in
            radians
        R - Earth radius (meters)
    Returns:
        lat, lon - SIP latitude and longitude in radians
    """
    psi = math.pi / 2 - el - math.asin(math.cos(el) * R / (R + hm))
    lat = bi = math.asin(math.sin(s_lat) * math.cos(psi) + math.cos(s_lat) * math.sin(psi) * math.cos(az))
    lon = sli = s_lon + math.asin(math.sin(psi) * math.sin(az) / math.cos(bi))

    lon = lon - 2 * math.pi if lon > math.pi else lon
    lon = lon + 2 * math.pi if lon < -math.pi else lon
    return lat, lon

def calculate_sip_scalar_numpy(s_lat, s_lon, az, el, hm = 350000, R=RE_meters):
    """Same as calculate_sip_scalar_math but math is replaced with numpy
    """
    psi = np.pi / 2 - el - np.arcsin(np.cos(el) * R / (R + hm))
    lat = bi = np.arcsin(np.sin(s_lat) * np.cos(psi) + np.cos(s_lat) * np.sin(psi) * np.cos(az))
    lon = sli = s_lon + np.arcsin(np.sin(psi) * np.sin(az) / np.cos(bi))

    lon = lon - 2 * np.pi if lon > np.pi else lon
    lon = lon + 2 * np.pi if lon < -np.pi else lon
    return lat, lon


def calculate_sip_vectorized(s_lat, s_lon, az, el, hm=350000, R=RE_meters):
    """
    Calculate the subionospheric point (SIP) for arrays of lines of sight.
    Parameters:
        s_lat, s_lon - site latitude and longitude in radians (scalars or arrays)
        hm - height of the ionospheric maximum (meters)
        az, el - azimuth and elevation of the site-satellite line of sight in
            radians (scalars or arrays)
        R - Earth radius (meters)
    Returns:
        lat, lon - SIP latitude and longitude in radians
    """
    psi = np.pi / 2 - el - np.arcsin(np.cos(el) * R / (R + hm))
    lat = np.arcsin(np.sin(s_lat) * np.cos(psi) + np.cos(s_lat) * np.sin(psi) * np.cos(az))
    lon = s_lon + np.arcsin(np.sin(psi) * np.sin(az) / np.cos(lat))

    # Normalize longitude to [-pi, pi]
    lon = np.where(lon > np.pi, lon - 2 * np.pi, lon)
    lon = np.where(lon < -np.pi, lon + 2 * np.pi, lon)

    return lat, lon
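For reference, the three functions above evaluate the same expressions. Written out, with symbols chosen here for illustration ($\varphi_s$, $\lambda_s$ for `s_lat`, `s_lon`; $A$ for `az`; $\varepsilon$ for `el`; $h_m$ for `hm`), the code corresponds to:

$$
\psi = \frac{\pi}{2} - \varepsilon - \arcsin\!\left(\frac{R}{R + h_m}\,\cos\varepsilon\right)
$$

$$
\varphi_{\mathrm{SIP}} = \arcsin\!\left(\sin\varphi_s \cos\psi + \cos\varphi_s \sin\psi \cos A\right)
$$

$$
\lambda_{\mathrm{SIP}} = \lambda_s + \arcsin\!\left(\frac{\sin\psi \,\sin A}{\cos\varphi_{\mathrm{SIP}}}\right)
$$

The longitude is then wrapped into $[-\pi, \pi]$ by subtracting $2\pi$ when $\lambda_{\mathrm{SIP}} > \pi$ and adding $2\pi$ when $\lambda_{\mathrm{SIP}} < -\pi$.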



## Correctness test

In [46]:
# Test function
def test_calculate_sip_equivalence():
    np.random.seed(42)

    # Generate test cases
    N = 10000
    s_lat = np.random.uniform(-np.pi / 2, np.pi / 2, N)
    s_lon = np.random.uniform(-np.pi, np.pi, N)
    az = np.random.uniform(0, 2 * np.pi, N)
    el = np.random.uniform(0.01, np.pi / 2 - 0.01, N)  # avoid el=0 or exactly pi/2

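    for i in range(N):
        lat_sm, lon_sm = calculate_sip_scalar_math(s_lat[i], s_lon[i], az[i], el[i])
        lat_sn, lon_sn = calculate_sip_scalar_numpy(s_lat[i], s_lon[i], az[i], el[i])
        lat_v, lon_v = calculate_sip_vectorized(s_lat[i], s_lon[i], az[i], el[i])

        # Vectorized version against the scalar math reference
        assert np.allclose(lat_sm, lat_v, atol=1e-10), f"Latitude mismatch at index {i}"
        assert np.allclose(lon_sm, lon_v, atol=1e-10), f"Longitude mismatch at index {i}"
        # Scalar numpy version against the scalar math reference
        assert np.allclose(lat_sm, lat_sn, atol=1e-10), f"Scalar latitude mismatch at index {i}"
        assert np.allclose(lon_sm, lon_sn, atol=1e-10), f"Scalar longitude mismatch at index {i}"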

    print("All tests passed!")

# Run the test
test_calculate_sip_equivalence()

All tests passed!


## Performance test

In [47]:
# Test data generator
import timeit

def generate_data(N):
    np.random.seed(0)
    s_lat = np.random.uniform(-np.pi / 2, np.pi / 2, N)
    s_lon = np.random.uniform(-np.pi, np.pi, N)
    az = np.random.uniform(0, 2 * np.pi, N)
    el = np.random.uniform(0.01, np.pi / 2 - 0.01, N)
    return s_lat, s_lon, az, el

# Wrapper functions for timing
def run_scalar_math():
    s_lat, s_lon, az, el = generate_data(N)
    lats, lons = [], []
    for i in range(N):
        sip_lat, sip_lon = calculate_sip_scalar_math(
            s_lat[i], s_lon[i], az[i], el[i]
        )
        lats.append(sip_lat)
        lons.append(sip_lon)

def run_scalar_numpy():
    s_lat, s_lon, az, el = generate_data(N)
    lats, lons = [], []
    for i in range(N):
        sip_lat, sip_lon = calculate_sip_scalar_numpy(
            s_lat[i], s_lon[i], az[i], el[i]
        )
        lats.append(sip_lat)
        lons.append(sip_lon)

def run_vectorized():
    calculate_sip_vectorized(*generate_data(N))

def run_naive_vectorized():
    # Wrap the scalar math implementation; np.vectorize still loops in Python
    naive_vectorized_scalar_sip = np.vectorize(calculate_sip_scalar_math)
    naive_vectorized_scalar_sip(*generate_data(N))



In [48]:
# Measure performance
N = 10000  # Adjust size based on desired test load
scalar_time_math = timeit.timeit(run_scalar_math, number=3)
scalar_time_numpy = timeit.timeit(run_scalar_numpy, number=3)
vectorized_time = timeit.timeit(run_vectorized, number=3)
naive_vectorized_time = timeit.timeit(run_naive_vectorized, number=3)

print(f"Scalar-math function time (3 runs):     {scalar_time_math:.4f} seconds")
print(f"Scalar-numpy function time (3 runs):     {scalar_time_numpy:.4f} seconds")
print(f"Vectorized function time (3 runs): {vectorized_time:.4f} seconds")
print(f"Naive vectorized function time (3 runs): {naive_vectorized_time:.4f} seconds")
print()
print(f"Speedup numpy-Scalar vs math-Scalar: ~{scalar_time_math / scalar_time_numpy:.2f}x")
print(f"Speedup Vectorized vs Scalar: ~{scalar_time_math / vectorized_time:.2f}x")
print(f"Speedup Naive Vectorized vs Scalar: ~{scalar_time_math / naive_vectorized_time:.2f}x")

Scalar-math function time (3 runs):     0.0440 seconds
Scalar-numpy function time (3 runs):     0.2264 seconds
Vectorized function time (3 runs): 0.0036 seconds
Naive vectorized function time (3 runs): 0.0323 seconds

Speedup numpy-Scalar vs math-Scalar: ~0.19x
Speedup Vectorized vs Scalar: ~12.38x
Speedup Naive Vectorized vs Scalar: ~1.36x
