# HPCSeries Core v0.8 - Getting Started

Welcome to HPCSeries Core! This notebook is a quick introduction to the library's Python API.

## What is HPCSeries Core?

HPCSeries Core is a high-performance statistical computing library that provides:
- **SIMD-accelerated operations** (AVX-512, AVX2, AVX, SSE2, NEON)
- **Automatic parallelization** with OpenMP
- **C/Fortran backend** for computationally intensive operations
- **Python API** for ease of use
- **Auto-tuning calibration** to optimize for your hardware

## What's New in v0.8.0?

**Exponential Weighted Statistics**:
- `ewma()` - Exponentially weighted moving average
- `ewvar()` - Exponentially weighted variance
- `ewstd()` - Exponentially weighted standard deviation
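
The recurrence behind an exponentially weighted moving average is short enough to write out. The sketch below is a plain-Python reference, not the library's implementation. It assumes the `adjust=False` form initialized to the first data point, which is what the Section 8 cells report for `ewma()`; the helper name `ewma_reference` is hypothetical.

```python
def ewma_reference(values, alpha):
    """Exponentially weighted moving average, adjust=False form.

    y[0] = x[0]
    y[t] = alpha * x[t] + (1 - alpha) * y[t-1]
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    for x in values:
        if not out:
            out.append(float(x))  # seed with the first observation
        else:
            out.append(alpha * x + (1.0 - alpha) * out[-1])
    return out


smoothed = ewma_reference([1.0, 2.0, 3.0], alpha=0.5)
print(smoothed)  # [1.0, 1.5, 2.25]
```

A larger `alpha` forgets history faster; `alpha=1.0` simply returns the input.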

**Time Series Transforms**:
- `diff()` - Time series differencing
- `cumulative_min()`, `cumulative_max()` - Running extrema
- `convolve_valid()` - FIR filtering

**Advanced Robust Statistics**:
- `trimmed_mean()` - Outlier-resistant mean
- `winsorized_mean()` - Clamped mean
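
Neither robust mean is demonstrated later in this notebook, so here is a small plain-Python illustration of what the two terms usually mean. The helper names and the `trim_fraction` parameter are hypothetical; how `hpcs.trimmed_mean()` and `hpcs.winsorized_mean()` are parameterized is not shown here and may differ.

```python
def trimmed_mean_reference(values, trim_fraction):
    """Mean after dropping the lowest and highest `trim_fraction` of points."""
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # points cut from each tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


def winsorized_mean_reference(values, trim_fraction):
    """Mean after clamping each tail to the nearest retained value."""
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    n = len(ordered)
    k = int(n * trim_fraction)
    low, high = ordered[k], ordered[n - k - 1]
    clamped = [min(max(v, low), high) for v in ordered]
    return sum(clamped) / n


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
print(sum(sample) / len(sample))               # 104.5
print(trimmed_mean_reference(sample, 0.1))     # 5.5
print(winsorized_mean_reference(sample, 0.1))  # 5.5
```

Trimming discards the extreme points, while winsorizing keeps the sample size and replaces them with the nearest retained value.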

## Installation

```bash
# Install in development (editable) mode; run from the repository root
pip install -e .
```

## Quick Start

In [1]:
import hpcs
import numpy as np
import time

# Display version and CPU info
print(f"HPCSeries Core version: {hpcs.__version__}")
print("\nRunning CPU detection...")
!hpcs cpuinfo

HPCSeries Core version: 0.8.0

Running CPU detection...
=== CPU Information ===

CPU Vendor:          AuthenticAMD
Physical Cores:      4
Logical Cores:       8
Optimal Threads:     2

Cache Hierarchy:
  L1:      32 KB
  L2:     512 KB
  L3:    4096 KB

NUMA Topology:
  Nodes:               1
  Cores per node:      4

SIMD Capabilities:
  Active ISA:          AVX2
  Vector width:        256-bit (4 doubles)
  SSE2:                ✓
  AVX:                 ✓
  AVX2:                ✓
  AVX-512:             ✗
  NEON:                ✗
  FMA3:                ✓


## 1. Basic Statistical Operations

HPCSeries provides fast implementations of common statistical functions:

In [2]:
# Create sample data
data = np.random.randn(1_000_000)

print("Basic Statistics:")
print(f"Sum:      {hpcs.sum(data):.6f}")
print(f"Mean:     {hpcs.mean(data):.6f}")
print(f"Std Dev:  {hpcs.std(data):.6f}")
print(f"Variance: {hpcs.var(data):.6f}")
print(f"Min:      {hpcs.min(data):.6f}")
print(f"Max:      {hpcs.max(data):.6f}")

Basic Statistics:
Sum:      -692.499148
Mean:     -0.000692
Std Dev:  0.999580
Variance: 0.999161
Min:      -4.660955
Max:      4.935668


## 2. Robust Statistics

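Robust statistics are less sensitive to outliers. The four outliers added here sum to zero, so the mean barely moves; the effect shows up in the standard deviation, while the median and MAD stay close to their outlier-free values: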

In [3]:
# Add some outliers
data_with_outliers = np.concatenate([data, [100, -100, 200, -200]])

print("Comparison: Regular vs Robust Statistics")
print(f"\nMean:          {hpcs.mean(data_with_outliers):.6f}")
print(f"Median:        {hpcs.median(data_with_outliers):.6f}")
print(f"\nStd Dev:       {hpcs.std(data_with_outliers):.6f}")
print(f"MAD (robust):  {hpcs.mad(data_with_outliers):.6f}")

Comparison: Regular vs Robust Statistics

Mean:          -0.000692
Median:        -0.000735

Std Dev:       1.048407
MAD (robust):  0.675018
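
The MAD of about 0.675 is close to 0.6745, the median absolute deviation of a standard normal distribution under the unscaled definition. That suggests `hpcs.mad()` returns the raw value, though this is an inference from the output above rather than a documented guarantee. The plain-Python sketch below spells out that definition; `mad_reference` and its `scale` parameter are hypothetical names.

```python
import statistics

# Scale factor that turns the MAD into an estimate of sigma for normal data.
NORMAL_CONSISTENCY = 1.4826


def mad_reference(values, scale=1.0):
    """Median absolute deviation: scale * median(|x - median(x)|)."""
    values = list(values)
    center = statistics.median(values)
    return scale * statistics.median([abs(v - center) for v in values])


clean = [2.0, 3.0, 4.0, 5.0, 6.0]
dirty = clean + [500.0]  # one gross outlier

print(mad_reference(clean))  # 1.0
print(mad_reference(dirty))  # 1.5

# The standard deviation reacts far more strongly to the same outlier.
print(statistics.pstdev(clean), statistics.pstdev(dirty))
```

To compare a MAD directly with a standard deviation on roughly normal data, pass `scale=NORMAL_CONSISTENCY`.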


## 3. Rolling Window Operations

Compute statistics over sliding windows (C-accelerated rolling operations):

In [4]:
# Create time series data
time_series = np.cumsum(np.random.randn(10000))
window = 50

# Rolling statistics
rolling_mean = hpcs.rolling_mean(time_series, window)
rolling_std = hpcs.rolling_std(time_series, window)
rolling_median = hpcs.rolling_median(time_series, window)

print(f"Computed rolling statistics with window={window}")
print(f"Output shape: {rolling_mean.shape}")
print(f"First {window-1} values are NaN (insufficient data)")
print(f"Valid values start at index {window-1}")

Computed rolling statistics with window=50
Output shape: (10000,)
First 49 values are NaN (insufficient data)
Valid values start at index 49
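
The NaN prefix follows from using a trailing window: position `i` is only defined once `window` values are available. A minimal plain-Python sketch of that convention, using a running sum, is shown below. The name `rolling_mean_reference` is hypothetical, and the running-sum technique is one common O(n) approach, not necessarily what the C kernel does.

```python
import math


def rolling_mean_reference(values, window):
    """Trailing-window mean; positions without a full window are NaN."""
    if window < 1:
        raise ValueError("window must be >= 1")
    out = [math.nan] * len(values)
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out[i] = running / window
    return out


result = rolling_mean_reference([1, 2, 3, 4, 5], window=3)
print(result)  # [nan, nan, 2.0, 3.0, 4.0]
```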


## 4. Z-Score Normalization (NEW in v0.7)

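Fast C-accelerated z-score computation. The "~0" and "~1" targets printed below apply to stationary data; `time_series` is a random walk, whose newest point tends to sit far from its window mean, so a standard deviation well above 1 is expected here: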

In [5]:
# Rolling z-score (C-optimized)
rolling_zscore = hpcs.rolling_zscore(time_series, window)

# Robust z-score using MAD
rolling_robust_zscore = hpcs.rolling_robust_zscore(time_series, window)

print(f"Rolling z-score computed (window={window})")
print(f"Mean of z-scores: {np.nanmean(rolling_zscore):.6f} (should be ~0)")
print(f"Std of z-scores:  {np.nanstd(rolling_zscore):.6f} (should be ~1)")

Rolling z-score computed (window=50)
Mean of z-scores: 0.008271 (should be ~0)
Std of z-scores:  1.377210 (should be ~1)
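
The standard deviation of about 1.38 is not a bug. In a random walk the newest point of each window is usually near an extreme of that window, which inflates its z-score. The plain-Python sketch below compares stationary noise with a random walk under one assumed convention (trailing window that includes the current point, population standard deviation); `rolling_zscore_reference` is a hypothetical helper, and HPCSeries may differ in those details.

```python
import math
import random


def rolling_zscore_reference(values, window):
    """z-score of each point against its own trailing window."""
    out = []
    for i in range(window - 1, len(values)):
        chunk = values[i - window + 1 : i + 1]
        mean = sum(chunk) / window
        var = sum((v - mean) ** 2 for v in chunk) / window
        out.append((values[i] - mean) / math.sqrt(var))
    return out


def spread(zs):
    """Population standard deviation of a list of z-scores."""
    m = sum(zs) / len(zs)
    return math.sqrt(sum((z - m) ** 2 for z in zs) / len(zs))


rng = random.Random(0)
steps = [rng.gauss(0.0, 1.0) for _ in range(5000)]  # stationary noise

walk, total = [], 0.0
for s in steps:  # cumulative sum turns the noise into a random walk
    total += s
    walk.append(total)

std_noise = spread(rolling_zscore_reference(steps, 50))
std_walk = spread(rolling_zscore_reference(walk, 50))
print(f"stationary noise: std of z-scores = {std_noise:.2f}")
print(f"random walk:      std of z-scores = {std_walk:.2f}")
```

Differencing the series first (see `diff()` in the v0.8 transforms) is one way to get z-scores with roughly unit spread.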


## 5. Performance Comparison

A single timed call per library gives a rough comparison with NumPy; timings vary between machines and from run to run:

In [56]:
# Large dataset for meaningful benchmarks
large_data = np.random.randn(10_000_000)

# Benchmark sum
start = time.perf_counter()
result_hpcs = hpcs.sum(large_data)
time_hpcs = time.perf_counter() - start

start = time.perf_counter()
result_numpy = np.sum(large_data)
time_numpy = time.perf_counter() - start

print(f"Sum of {len(large_data):,} elements:")
print(f"  HPCSeries: {time_hpcs*1000:.3f} ms")
print(f"  NumPy:     {time_numpy*1000:.3f} ms")
print(f"  Speedup:   {time_numpy/time_hpcs:.2f}x")
print(f"  Results match: {np.allclose(result_hpcs, result_numpy)}")

Sum of 10,000,000 elements:
  HPCSeries: 5.713 ms
  NumPy:     7.058 ms
  Speedup:   1.24x
  Results match: True
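
A single timed call includes one-time costs such as library initialization and cold caches, so the 1.24x figure should be read as indicative. A more stable measurement takes the best of several repeated runs after a warm-up call. The sketch below shows that pattern with the standard library's `timeit`, timing two built-in summation functions as stand-ins; `best_of` is a hypothetical helper name.

```python
import math
import timeit


def best_of(func, repeats=5, number=3):
    """Best per-call time in seconds over `repeats` batches of `number` calls."""
    func()  # warm-up call: pays one-time costs before timing starts
    batches = timeit.repeat(func, repeat=repeats, number=number)
    return min(batches) / number


data = [float(i) for i in range(200_000)]

t_builtin = best_of(lambda: sum(data))
t_fsum = best_of(lambda: math.fsum(data))
print(f"sum():       {t_builtin * 1e3:.3f} ms")
print(f"math.fsum(): {t_fsum * 1e3:.3f} ms")
```

The same helper applies to the cell above, for example `best_of(lambda: hpcs.sum(large_data))` against `best_of(lambda: np.sum(large_data))`.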


## 6. Auto-Tuning Calibration (NEW in v0.7)

HPCSeries can automatically calibrate to your hardware for optimal performance:

In [57]:
# Run quick calibration (5-10 seconds)
print("Running quick calibration...")
hpcs.calibrate(quick=True)
print("\nCalibration complete!")

# Save configuration
import os
config_path = os.path.expanduser("~/.hpcs/config.json")
hpcs.save_calibration_config(config_path)
print(f"Configuration saved to: {config_path}")

Running quick calibration...

Calibration complete!
Configuration saved to: /root/.hpcs/config.json


=== HPCS Quick Calibration ===
Running quick validation benchmark...
  Sum (1M elements): 1004.00 µs
Quick calibration complete (using hardware defaults).
Run hpcs_calibrate() for full optimization.
[Config] Configuration saved to: /root/.hpcs/config.json


## 7. CLI Commands

HPCSeries also provides a command-line interface:

In [58]:
# Show version
!hpcs version

# Show CPU information
!hpcs cpuinfo

# Run quick calibration
!hpcs calibrate --quick

# Run benchmarks
!hpcs bench --size 1000000 --iterations 5

HPCSeries Core v0.8.0
High-Performance Computing Series - Optimized Statistical Kernels

Features:
  • SIMD vectorization (AVX-512, AVX2, AVX, SSE2, NEON)
  • OpenMP parallelization with NUMA awareness
  • Adaptive auto-tuning (v0.5)
  • Fortran/C/Python unified API
=== CPU Information ===

CPU Vendor:          AuthenticAMD
Physical Cores:      4
Logical Cores:       8
Optimal Threads:     2

Cache Hierarchy:
  L1:      32 KB
  L2:     512 KB
  L3:    4096 KB

NUMA Topology:
  Nodes:               1
  Cores per node:      4

SIMD Capabilities:
  Active ISA:          AVX2
  Vector width:        256-bit (4 doubles)
  SSE2:                ✓
  AVX:                 ✓
  AVX2:                ✓
  AVX-512:             ✗
  NEON:                ✗
  FMA3:                ✓
=== HPCSeries Auto-Tuning Calibration ===

This will benchmark optimal parallelization thresholds for your system.

Mode: quick calibration
Estimated time: 5-10 seconds

Running benchmarks...
=== HPCS Quick Calibration ===
Runnin

## 8. Exponential Weighted Statistics (NEW in v0.8)

Exponential weighted statistics give more weight to recent observations, which makes them well suited to tracking trends in time series data:

In [64]:
# Create a time series with a trend
np.random.seed(42)
trend = np.linspace(0, 10, 1000)
noise = np.random.randn(1000) * 0.5
ts_data = trend + noise

# Compare different smoothing approaches
alpha = 0.1  # Smoothing factor (0.1 = slow/smooth, 0.9 = fast/responsive)

# EWMA - tracks trend with exponential weighting
ewma_result = hpcs.ewma(ts_data, alpha)

# Standard rolling mean for comparison
rolling_mean_result = hpcs.rolling_mean(ts_data, window=50)

print(f"Exponential Weighted Moving Average (alpha={alpha}):")
print(f"  Input length:  {len(ts_data)}")
print(f"  Output length: {len(ewma_result)}")
print(f"  First value:   {ewma_result[0]:.4f} (initialized to first data point)")
print(f"  Last value:    {ewma_result[-1]:.4f}")
print(f"\nEWMA vs Rolling Mean:")
print(f"  EWMA:         No NaN values (uses all history)")
print(f"  Rolling Mean: First {49} values are NaN")

Exponential Weighted Moving Average (alpha=0.1):
  Input length:  1000
  Output length: 1000
  First value:   0.2484 (initialized to first data point)
  Last value:    9.8936

EWMA vs Rolling Mean:
  EWMA:         No NaN values (uses all history)
  Rolling Mean: First 49 values are NaN


In [63]:
# Exponential weighted variance and std dev
# (uses ts_data and alpha defined in the previous cell)
ewvar_result = hpcs.ewvar(ts_data, alpha)
ewstd_result = hpcs.ewstd(ts_data, alpha)

print(f"Exponential Weighted Variance and Std Dev:")
print(f"  EWVAR: {ewvar_result[-1]:.4f}")
print(f"  EWSTD: {ewstd_result[-1]:.4f}")
print(f"\nComparison with pandas.ewm():")
print(f"  HPCSeries uses adjust=False (pandas defaults to adjust=True)")
print(f"  EWMA matches: pandas.Series(data).ewm(alpha={alpha}, adjust=False).mean()")


## 9. Time Series Transforms (NEW in v0.8)

Transform time series data with differencing, cumulative operations, and filtering:

In [60]:
# Differencing - useful for making time series stationary
stock_prices = np.array([100, 102, 101, 105, 103, 107, 110])
price_changes = hpcs.diff(stock_prices, order=1)

print("Stock Price Differencing:")
print(f"  Prices:  {stock_prices}")
print(f"  Changes: {price_changes}")
print(f"  Note: First value is NaN (no previous value to difference)")

# Cumulative extrema - track running min/max
values = np.array([5, 3, 7, 2, 8, 1, 9])
cum_min = hpcs.cumulative_min(values)
cum_max = hpcs.cumulative_max(values)

print(f"\nCumulative Extrema:")
print(f"  Values:     {values}")
print(f"  Cum Min:    {cum_min}")
print(f"  Cum Max:    {cum_max}")

Stock Price Differencing:
  Prices:  [100 102 101 105 103 107 110]
  Changes: [nan  2. -1.  4. -2.  4.  3.]
  Note: First value is NaN (no previous value to difference)

Cumulative Extrema:
  Values:     [5 3 7 2 8 1 9]
  Cum Min:    [5. 3. 3. 2. 2. 1. 1.]
  Cum Max:    [5. 5. 7. 7. 8. 8. 9.]


In [59]:
# FIR filtering with convolution
signal = np.random.randn(1000)
# 3-point moving average filter
kernel = np.array([1/3, 1/3, 1/3])
filtered = hpcs.convolve_valid(signal, kernel)

print(f"FIR Filtering with convolve_valid():")
print(f"  Input signal:  {len(signal)} points")
print(f"  Filter kernel: {len(kernel)} points")
print(f"  Output:        {len(filtered)} points (no padding)")
print(f"  Formula: n - m + 1 = {len(signal)} - {len(kernel)} + 1 = {len(filtered)}")

FIR Filtering with convolve_valid():
  Input signal:  1000 points
  Filter kernel: 3 points
  Output:        998 points (no padding)
  Formula: n - m + 1 = 1000 - 3 + 1 = 998


## Next Steps

Explore more advanced features:

1. **01_rolling_mean_vs_median.ipynb** - Rolling window operations
2. **02_robust_anomaly_climate.ipynb** - Anomaly detection with robust statistics
3. **03_batched_iot_rolling.ipynb** - Batched processing for IoT data
4. **04_axis_reductions_column_stats.ipynb** - 2D array operations
5. **05_masked_missing_data.ipynb** - Handling missing data
6. **06_performance_calibration.ipynb** - Performance tuning and optimization
7. **07_c_optimized_operations.ipynb** - C-accelerated operations (v0.7)
8. **08_numpy_pandas_migration_guide.ipynb** - Migration from NumPy/Pandas
9. **09_real_world_applications.ipynb** - Production use cases
10. **10_exponential_weighted_statistics.ipynb** - Deep dive into EWMA/EWVAR/EWSTD (NEW in v0.8)

## Documentation

- **GitHub**: [HPCSeries Core Repository](https://github.com/yourusername/HPCSeriesCore)
- **API Reference**: See `docs/` directory
- **Specifications**: See root directory for detailed specs

## Support

For issues, questions, or contributions, please visit the GitHub repository.