# SciPy Introduction & Basics
- **SciPy**: Scientific Python library built on NumPy for advanced scientific computing
- **Purpose**: Provides tools for optimization, integration, interpolation, linear algebra, statistics, signal processing, and more
- **Foundation**: Extends NumPy with specialized algorithms and domain-specific functions

Key characteristics:
- Built on NumPy arrays
- Organized into subpackages by domain
- Wraps optimized compiled code (BLAS and LAPACK for linear algebra, pocketfft for FFTs)
- Open-source and community-driven

## Installation & Setup
```bash
# Using pip
pip install scipy

# Using conda
conda install scipy

# Verify installation
python -c "import scipy; print(scipy.__version__)"
```

In [1]:
import numpy as np
import scipy

# Check versions
print("NumPy version:", np.__version__)
print("SciPy version:", scipy.__version__)

# SciPy requires NumPy
# Each SciPy release supports a specific range of NumPy versions

NumPy version: 2.1.3
SciPy version: 1.15.3


## SciPy vs NumPy

| Feature | NumPy | SciPy |
|---------|-------|-------|
| Purpose | Array operations, basic math | Advanced scientific algorithms |
| Scope | Foundation library | Domain-specific tools |
| Examples | Array creation, slicing, basic stats | Optimization, integration, signal processing |
| Dependency | Standalone | Built on NumPy |
| Speed | Fast array operations | Optimized specialized algorithms |
| Use Case | Data manipulation | Scientific computing tasks |

**Relationship**: NumPy provides the data structure (ndarray), SciPy provides the algorithms
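
As a minimal sketch of this division of labor, the example below builds sampled data with NumPy and hands it to a SciPy integration routine. The sample count and the choice of Simpson's rule are illustrative assumptions, not part of the original notebook.

```python
import numpy as np
from scipy import integrate

# NumPy supplies the data: 101 samples of sin(x) on [0, pi]
x = np.linspace(0.0, np.pi, 101)
y = np.sin(x)

# SciPy supplies the algorithm: Simpson's rule on the sampled values
area = integrate.simpson(y, x=x)

print("Integral of sin(x) from 0 to pi:", area)  # exact value is 2
```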

## SciPy Subpackages Organization

SciPy is organized into domain-specific subpackages:

| Subpackage | Domain | Key Functions |
|------------|--------|---------------|
| `scipy.cluster` | Clustering algorithms | K-means, hierarchical clustering |
| `scipy.constants` | Physical/mathematical constants | Speed of light, pi, Planck constant |
| `scipy.fft` | Fast Fourier Transforms | FFT, IFFT, frequency analysis |
| `scipy.integrate` | Integration & ODEs | Numerical integration, ODE solvers |
| `scipy.interpolate` | Interpolation | Splines, 1D/2D/ND interpolation |
| `scipy.io` | Input/Output | MATLAB files, data formats |
| `scipy.linalg` | Linear algebra | Matrix decomposition, solvers |
| `scipy.ndimage` | N-dimensional images | Image filtering, morphology |
| `scipy.odr` | Orthogonal distance regression | Curve fitting with errors |
| `scipy.optimize` | Optimization | Minimization, root finding |
| `scipy.signal` | Signal processing | Filters, convolution, spectral analysis |
| `scipy.sparse` | Sparse matrices | Sparse arrays, graph algorithms |
| `scipy.spatial` | Spatial algorithms | Distance, KD-trees, Voronoi |
| `scipy.special` | Special functions | Bessel, gamma, error functions |
| `scipy.stats` | Statistics | Distributions, hypothesis tests |
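
To give a feel for how the table maps to code, here is a short sketch that touches three of the subpackages listed above. The specific calls were chosen for illustration only; each subpackage offers far more than is shown here.

```python
from scipy import constants, special
from scipy.spatial import distance

# scipy.constants: physical constants in SI units
speed_of_light = constants.c

# scipy.special: special functions; Gamma(5) equals 4! = 24
gamma_5 = special.gamma(5)

# scipy.spatial: Euclidean distance between two points
dist = distance.euclidean([0, 0], [3, 4])

print("Speed of light (m/s):", speed_of_light)
print("Gamma(5):", gamma_5)
print("Distance from (0, 0) to (3, 4):", dist)
```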

In [2]:
# Method 1: Import entire subpackage
from scipy import optimize
from scipy import linalg

# Method 2: Import specific functions
from scipy.optimize import minimize
from scipy.linalg import inv

# Method 3: Import with alias
import scipy.stats as stats
import scipy.signal as signal

print("Subpackages imported successfully")

# Convention: Import what you need, not entire scipy
# Lazy loading: Subpackages load only when first accessed

Subpackages imported successfully


In [3]:
# List all scipy subpackages
scipy_subpackages = [
    'cluster', 'constants', 'fft', 'integrate', 'interpolate',
    'io', 'linalg', 'ndimage', 'odr', 'optimize',
    'signal', 'sparse', 'spatial', 'special', 'stats'
]

print("Main SciPy subpackages:")
for pkg in scipy_subpackages:
    print(f"  - scipy.{pkg}")

# Each subpackage has its own documentation
# Access with: help(scipy.subpackage)

Main SciPy subpackages:
  - scipy.cluster
  - scipy.constants
  - scipy.fft
  - scipy.integrate
  - scipy.interpolate
  - scipy.io
  - scipy.linalg
  - scipy.ndimage
  - scipy.odr
  - scipy.optimize
  - scipy.signal
  - scipy.sparse
  - scipy.spatial
  - scipy.special
  - scipy.stats


### When to Use NumPy vs SciPy

**NumPy Examples**:
- Array creation: `np.array()`, `np.zeros()`, `np.arange()`
- Basic math: `np.mean()`, `np.sum()`, `np.sqrt()`
- Linear algebra basics: `np.dot()`, `np.transpose()`

**SciPy Examples**:
- Advanced linear algebra: `scipy.linalg.svd()`, `scipy.linalg.eig()`
- Optimization: `scipy.optimize.minimize()`
- Integration: `scipy.integrate.quad()`
- Statistical tests: `scipy.stats.ttest_ind()`
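
The split can be seen on a single dataset: NumPy handles the descriptive step and SciPy handles the inferential one. The two samples below are made-up values used only for illustration.

```python
import numpy as np
from scipy import stats

# Two small illustrative samples (hypothetical measurements)
group_a = np.array([5.1, 4.9, 5.0, 5.2, 4.8])
group_b = group_a + 1.0

# NumPy: descriptive statistics
print("Mean of A:", np.mean(group_a))
print("Mean of B:", np.mean(group_b))

# SciPy: two-sample t-test on the same arrays
test = stats.ttest_ind(group_a, group_b)
print("t statistic:", test.statistic)
print("p-value:", test.pvalue)
```

A small p-value here indicates that the difference in means is unlikely under the null hypothesis of equal means.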

In [4]:
import numpy as np
from scipy import linalg

# Create a matrix
matrix = np.array([[1, 2], [3, 4]])

# NumPy: determinant via LAPACK LU factorization
det_numpy = np.linalg.det(matrix)
print("NumPy determinant:", det_numpy)

# SciPy: same LAPACK-based approach, with extra options
# such as overwrite_a and check_finite
det_scipy = linalg.det(matrix)
print("SciPy determinant:", det_scipy)

# Both agree to within floating-point rounding
# scipy.linalg offers a broader set of routines than numpy.linalg

NumPy determinant: -2.0000000000000004
SciPy determinant: -2.0


## SciPy Conventions & Best Practices

### 1. **Input/Output Conventions**
- Functions accept NumPy arrays (or array-like objects)
- Return NumPy arrays or specialized objects
- Preserve input data (no in-place modifications unless specified)
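
A minimal sketch of these conventions, using `scipy.linalg.solve` as one representative function: it accepts a plain list as the right-hand side, returns an ndarray, and leaves the input matrix untouched by default. The 2x2 system is an illustrative assumption.

```python
import numpy as np
from scipy import linalg

A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = [9.0, 8.0]          # a plain list is accepted as array-like
A_before = A.copy()

x = linalg.solve(A, b)  # solves A @ x = b

print("Solution:", x)
print("Return type:", type(x))
print("Input unchanged:", np.array_equal(A, A_before))

# In-place reuse of the input buffer is opt-in, e.g. overwrite_a=True
```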

### 2. **Naming Conventions**
- Functions use lowercase with underscores: `minimize_scalar()`
- Classes use CamelCase: `OptimizeResult`
- Private functions prefixed with `_`

### 3. **Documentation Access**
- `help(function)`: Full documentation
- `function?`: IPython/Jupyter quick help
- `function??`: View source code

### 4. **Common Parameters**
- `axis`: Axis along which to operate
- `method`: Algorithm selection
- `tol`: Tolerance for convergence
- `maxiter`: Maximum iterations
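
The sketch below shows how these parameters typically appear in calls. Note that in `scipy.optimize.minimize`, `maxiter` is passed inside the `options` dictionary rather than as a top-level argument. The function `bowl` and the starting point are hypothetical, chosen only for illustration.

```python
import numpy as np
from scipy import optimize, stats

data = np.array([[1.0, 2.0, 3.0],
                 [4.0, 6.0, 8.0]])

# axis: standardize down each column (axis=0) or along each row (axis=1)
col_z = stats.zscore(data, axis=0)
row_z = stats.zscore(data, axis=1)
print("Column-wise z-scores:\n", col_z)
print("Row-wise z-scores:\n", row_z)

# Hypothetical objective with its minimum at (1, -2)
def bowl(x):
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2

# method, tol, and maxiter (via options) control the solver
res = optimize.minimize(
    bowl,
    x0=[3.0, 3.0],
    method="Nelder-Mead",
    tol=1e-8,
    options={"maxiter": 1000},
)
print("Converged:", res.success, "at", res.x)
```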

In [5]:
from scipy import optimize

# Access documentation (uncomment to run)
# help(optimize.minimize)

# In Jupyter/IPython:
# optimize.minimize?

# List the public names (functions and classes) in a subpackage
optimize_names = [name for name in dir(optimize) if not name.startswith('_')]
print("First 10 public names in scipy.optimize:")
print(optimize_names[:10])

First 10 public names in scipy.optimize:
['BFGS', 'Bounds', 'BroydenFirst', 'HessianUpdateStrategy', 'InverseJacobian', 'KrylovJacobian', 'LbfgsInvHessProduct', 'LinearConstraint', 'NoConvergence', 'NonlinearConstraint']


### Result Objects

Many SciPy functions return specialized result objects:

```python
result = scipy.optimize.minimize(func, x0)
result.x         # Solution
result.success   # Convergence status
result.message   # Status message
result.fun       # Function value at solution
```

**Benefits**:
- Structured output
- Easy attribute access
- Consistent across `scipy.optimize` solvers
- Self-documenting
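
One detail worth knowing is that `OptimizeResult` is a dictionary subclass, so fields can be read either as attributes or as keys. The sketch below illustrates this with `minimize_scalar`; the function `parabola` is a hypothetical example.

```python
from scipy.optimize import OptimizeResult, minimize_scalar

# Hypothetical objective with its minimum at x = 2, f(2) = 1
def parabola(x):
    return (x - 2.0) ** 2 + 1.0

res = minimize_scalar(parabola)

print("Is OptimizeResult:", isinstance(res, OptimizeResult))
print("Is dict:", isinstance(res, dict))
print("Attribute access:", res.x)
print("Key access:", res["x"])
print("Available fields:", sorted(res.keys()))
```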

In [6]:
from scipy.optimize import minimize

# Define a simple function: f(x) = x^2
def quadratic(x):
    return x**2

# Minimize starting from x0 = 5
result = minimize(quadratic, x0=5)

print("Optimization successful:", result.success)
print("Optimal x:", result.x)  # Should be close to 0
print("Minimum value:", result.fun)  # Should be close to 0
print("Number of iterations:", result.nit)
print("\nFull result object:")
print(result)

Optimization successful: True
Optimal x: [-2.62955131e-08]
Minimum value: 6.914540092077327e-16
Number of iterations: 3

Full result object:
  message: Optimization terminated successfully.
  success: True
   status: 0
      fun: 6.914540092077327e-16
        x: [-2.630e-08]
      nit: 3
      jac: [-3.769e-08]
 hess_inv: [[ 5.000e-01]]
     nfev: 8
     njev: 4


## Common Pitfalls & Solutions

### 1. **Passing Lists Where Arrays Are Expected**
```python
# Works, but the list is converted to an array on every call
data = [1, 2, 3, 4]

# Better: create the NumPy array once and reuse it
data = np.array([1, 2, 3, 4])
```

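### 2. **Relying on `import scipy` Alone**
```python
# Less portable: attribute access fails on older SciPy releases
import scipy
scipy.optimize.minimize(...)  # Works only where subpackages load lazily

# Preferred: import the subpackage explicitly
from scipy import optimize
optimize.minimize(...)  # Clear about which subpackage is used
```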

### 3. **Ignoring Return Types**
```python
# Check return type
result = some_scipy_function()
print(type(result))  # May be array, object, or tuple
```

### 4. **Not Checking Convergence**
```python
result = optimize.minimize(func, x0)
if not result.success:
    print("Warning:", result.message)
```
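
To see why the check matters, the sketch below deliberately starves the solver of iterations so that it stops early, then retries with a larger budget. The iteration limits are arbitrary values chosen for illustration.

```python
from scipy.optimize import minimize, rosen

x0 = [-1.2, 1.0]

# Too few iterations: the solver stops before converging
res = minimize(rosen, x0, method="Nelder-Mead", options={"maxiter": 5})
if not res.success:
    print("Warning:", res.message)

# Retry with a larger iteration budget
retry = minimize(rosen, x0, method="Nelder-Mead", options={"maxiter": 2000})
print("Retry converged:", retry.success)
print("Solution:", retry.x)  # the Rosenbrock minimum is at (1, 1)
```

Without the `success` check, the first call would silently hand back a point that is nowhere near the minimum.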

In [7]:
import numpy as np
from scipy import stats

# Pitfall 1: Python list vs NumPy array
python_list = [1, 2, 3, 4, 5]
numpy_array = np.array([1, 2, 3, 4, 5])

# SciPy accepts both; a list is converted to an array on each call
mean_list = stats.tmean(python_list)
mean_array = stats.tmean(numpy_array)

print("Mean from list:", mean_list)
print("Mean from array:", mean_array)
print("\nRecommendation: Prefer NumPy arrays for large or reused data")

# Pitfall 3: Check return types
normal_dist = stats.norm(loc=0, scale=1)
samples = normal_dist.rvs(size=5)
print("\nSamples type:", type(samples))  # Returns ndarray
print("Distribution type:", type(normal_dist))  # Returns frozen distribution object

Mean from list: 3.0
Mean from array: 3.0

Recommendation: Prefer NumPy arrays for large or reused data

Samples type: <class 'numpy.ndarray'>
Distribution type: <class 'scipy.stats._distn_infrastructure.rv_continuous_frozen'>


## Performance Considerations

### SciPy Performance Characteristics:
1. **Compiled Code**: Most functions wrap C/Fortran libraries
2. **Vectorization**: Operations work on entire arrays
3. **Memory Efficiency**: Some routines can reuse input buffers on request (for example, the `overwrite_a` flag in `scipy.linalg`)
4. **Algorithm Selection**: Multiple methods for same task
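
As a small illustration of the vectorization point, the sketch below evaluates the error function over a whole array in one call and compares it with an element-by-element Python loop. The input grid is an arbitrary choice.

```python
import math

import numpy as np
from scipy import special

x = np.linspace(-2.0, 2.0, 5)

# Vectorized: one call evaluates erf over the whole array in compiled code
vectorized = special.erf(x)

# Equivalent pure-Python loop, shown only for comparison
looped = np.array([math.erf(v) for v in x])

print("Vectorized:", vectorized)
print("Results match:", np.allclose(vectorized, looped))
```

For arrays of realistic size, the vectorized call avoids the per-element interpreter overhead of the loop.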

### Optimization Tips:
- Choose the `method` suited to the problem
- Provide analytical derivatives when possible
- Choose an appropriate tolerance (`tol`)
- Use sparse matrices for large sparse data
- Use FFT-based convolution (`scipy.signal.fftconvolve`) for large inputs
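
The tip about analytical derivatives can be sketched with SciPy's built-in Rosenbrock helpers, `rosen` and `rosen_der`. The starting point and the choice of BFGS are illustrative assumptions. Without `jac`, the solver estimates the gradient by finite differences, which costs extra function evaluations.

```python
from scipy.optimize import minimize, rosen, rosen_der

x0 = [1.3, 0.7, 0.8, 1.9, 1.2]

# Gradient estimated numerically by finite differences
numeric = minimize(rosen, x0, method="BFGS")

# Gradient supplied analytically through the jac argument
analytic = minimize(rosen, x0, method="BFGS", jac=rosen_der)

print("Function evaluations (numeric gradient): ", numeric.nfev)
print("Function evaluations (analytic gradient):", analytic.nfev)
print("Solution with analytic gradient:", analytic.x)
```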

In [8]:
import numpy as np
from scipy import linalg
import time

# Create a large matrix
n = 500
A = np.random.rand(n, n)

# Time NumPy linear algebra
start = time.perf_counter()
det_np = np.linalg.det(A)
time_np = time.perf_counter() - start

# Time SciPy linear algebra
start = time.perf_counter()
det_sp = linalg.det(A)
time_sp = time.perf_counter() - start

print(f"NumPy time: {time_np:.6f}s")
print(f"SciPy time: {time_sp:.6f}s")
print("\nBoth use optimized LAPACK routines")
print("Single-run timings are noisy; the first call may include one-time startup cost")

NumPy time: 0.051499s
SciPy time: 0.004133s

Both use optimized LAPACK routines
Single-run timings are noisy; the first call may include one-time startup cost


## Additional Resources

### Documentation:
- Official docs: https://docs.scipy.org/
- API reference: https://docs.scipy.org/doc/scipy/reference/
- Tutorials: https://docs.scipy.org/doc/scipy/tutorial/

### Quick Reference Commands:
```python
# Check version
scipy.__version__

# List public names, including subpackages
dir(scipy)

# Function documentation
help(scipy.optimize.minimize)

# Subpackage documentation
help(scipy.optimize)
```

### Testing Installation:
```python
# Run full test suite (requires pytest)
scipy.test()

# Test specific subpackage
scipy.optimize.test()
```

In [9]:
import scipy
import numpy as np
import sys

print("=== Environment Information ===")
print(f"Python version: {sys.version}")
print(f"NumPy version: {np.__version__}")
print(f"SciPy version: {scipy.__version__}")
print("\n=== SciPy Configuration ===")
# scipy.show_config()  # Uncomment to see BLAS/LAPACK info

# Check if specific subpackages are available
try:
    from scipy import optimize, linalg, stats
    print("\nCore subpackages loaded successfully ✓")
except ImportError as e:
    print(f"Import error: {e}")

=== Environment Information ===
Python version: 3.13.9 | packaged by Anaconda, Inc. | (main, Oct 21 2025, 19:11:29) [Clang 20.1.8 ]
NumPy version: 2.1.3
SciPy version: 1.15.3

=== SciPy Configuration ===

Core subpackages loaded successfully ✓


## Summary: Key Takeaways

✓ **SciPy extends NumPy** with advanced scientific algorithms  
✓ **Organized by domain** into 15+ specialized subpackages  
✓ **Import selectively** - use `from scipy import subpackage`  
✓ **Works with NumPy arrays** - prefer arrays over lists for large or reused data  
✓ **Returns result objects** - many solvers return objects with `.success`, `.x`, `.fun` attributes  
✓ **Built on optimized libraries** - LAPACK and BLAS for linear algebra, pocketfft for FFTs  
✓ **Check documentation** - use `help()`, `?`, or online docs  

### Next Steps:
1. Explore specific subpackages (optimization, stats, signal, etc.)
2. Practice with real datasets
3. Compare NumPy vs SciPy implementations
4. Learn when to use each library