WeightedPandas extends the pandas Series and DataFrame classes to support weighted operations. It provides drop-in replacements for these pandas objects that apply the weights automatically in statistical operations.
pip install weightedpandas

- Python 3.8+
- pandas 1.4.0+
- numpy 1.20.0+
- Weighted versions of common statistical operations:
  - sum(), mean(), var(), std()
  - median(), quantile()
  - corr(), cov()
- Preserves weights through arithmetic operations
- Familiar pandas interface
- Supports both Series and DataFrame objects
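
To illustrate what "preserves weights through arithmetic operations" means, here is a minimal plain-Python sketch. `ToyWeighted` is a hypothetical stand-in written only for this illustration; it is not the library's class and says nothing about how WeightedSeries is implemented internally. The idea shown is simply that arithmetic transforms the values while the weights are carried over unchanged.

```python
class ToyWeighted:
    """Hypothetical stand-in for a weighted series (not the library's class)."""

    def __init__(self, values, weights):
        if len(values) != len(weights):
            raise ValueError("values and weights must have the same length")
        self.values = list(values)
        self.weights = list(weights)

    def _apply(self, func):
        # Transform the values only; the weights are copied over unchanged.
        return ToyWeighted([func(v) for v in self.values], self.weights)

    def __mul__(self, scalar):
        return self._apply(lambda v: v * scalar)

    def __add__(self, scalar):
        return self._apply(lambda v: v + scalar)

    def mean(self):
        # Weighted mean: weighted sum divided by the sum of weights.
        total = sum(self.weights)
        return sum(v * w for v, w in zip(self.values, self.weights)) / total


ws = ToyWeighted([1, 2, 3, 4, 5], weights=[5, 4, 3, 2, 1])
ws2 = ws * 2 + 1
print(ws2.values)   # [3, 5, 7, 9, 11]
print(ws2.weights)  # [5, 4, 3, 2, 1]
```

Because the weights are unchanged, the weighted mean of `ws2` equals `2 * ws.mean() + 1`, as it would for any linear transformation.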
import pandas as pd
import numpy as np
from weightedpandas import WeightedSeries, WeightedDataFrame
# Create a weighted series
data = [1, 2, 3, 4, 5]
weights = [5, 4, 3, 2, 1]
ws = WeightedSeries(data, weights=weights)
# Calculate weighted statistics
print(f"Weighted sum: {ws.sum()}")
print(f"Weighted mean: {ws.mean()}")
print(f"Weighted median: {ws.median()}")
print(f"Weighted standard deviation: {ws.std()}")
# Create a weighted dataframe
df_data = {
    'A': [1, 2, 3, 4, 5],
    'B': [5, 4, 3, 2, 1]
}
wdf = WeightedDataFrame(df_data, weights=weights)
# Calculate weighted statistics
print(wdf.sum())
print(wdf.mean())
print(wdf.corr())
# Weights are preserved through operations
ws2 = ws * 2 + 1
print(ws2.weights)  # Same as original weights

In weighted calculations:

- sum(): Each value is multiplied by its weight before summing
- mean(): The weighted sum divided by the sum of weights
- var() and std(): Each squared deviation from the weighted mean is multiplied by its weight
- quantile(): The quantile is determined from the weighted cumulative distribution
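
The definitions above can be sketched in plain Python, without pandas or NumPy, so that the arithmetic is visible. The function names are hypothetical and this is not the library's code. Two details are assumptions the text does not settle: the variance is normalized by the total weight (population style), and the quantile is the smallest value whose cumulative weight share reaches `q`, with no interpolation. The library may make different choices, so its results can differ from this sketch.

```python
from math import sqrt


def weighted_sum(values, weights):
    # Each value is multiplied by its weight before summing.
    return sum(v * w for v, w in zip(values, weights))


def weighted_mean(values, weights):
    # Weighted sum divided by the sum of weights.
    return weighted_sum(values, weights) / sum(weights)


def weighted_var(values, weights):
    # Assumption: normalize by the total weight (population-style variance).
    mean = weighted_mean(values, weights)
    squared = sum(w * (v - mean) ** 2 for v, w in zip(values, weights))
    return squared / sum(weights)


def weighted_std(values, weights):
    return sqrt(weighted_var(values, weights))


def weighted_quantile(values, weights, q):
    # Assumption: return the smallest value whose cumulative weight share
    # reaches q (no interpolation between neighbouring values).
    pairs = sorted(zip(values, weights))
    total = sum(weights)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative / total >= q:
            return value
    return pairs[-1][0]  # Guard against floating-point round-off.


data = [1, 2, 3, 4, 5]
weights = [5, 4, 3, 2, 1]

print(weighted_sum(data, weights))            # 35
print(weighted_mean(data, weights))           # 35 / 15, about 2.333
print(weighted_var(data, weights))            # 14 / 9, about 1.556
print(weighted_quantile(data, weights, 0.5))  # 2
```

With these weights the low values dominate, so the weighted mean (about 2.33) and weighted median (2) both fall below the unweighted mean and median of 3.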
For convenience, you can use the following helper functions:
from weightedpandas import weighted_series, weighted_dataframe
# These are equivalent to the constructor calls
ws = weighted_series(data, weights=weights)
wdf = weighted_dataframe(df_data, weights=weights)

This project is licensed under the MIT License; see the LICENSE file for details.