{{ message }}

# jsvine / weightedcalcs

Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.

## Files

Failed to load latest commit information.
Type
Name
Commit time

# weightedcalcs

`weightedcalcs` is a `pandas`-based Python library for calculating weighted means, medians, standard deviations, and more.

## Features

• Plays well with `pandas`.
• Support for weighted means, medians, quantiles, standard deviations, and distributions.
• Support for grouped calculations, using `DataFrameGroupBy` objects.
• Raises an error when your data contains null-values.
• Full test coverage.

## Installation

`pip install weightedcalcs`

## Usage

### Getting started

Every weighted calculation in `weightedcalcs` begins with an instance of the `weightedcalcs.Calculator` class. `Calculator` takes one argument: the name of your weighting variable. So if you're analyzing a survey where the weighting variable is called `"resp_weight"`, you'd do this:

```import weightedcalcs as wc
calc = wc.Calculator("resp_weight")```

### Types of calculations

Currently, `weightedcalcs.Calculator` supports the following calculations:

• `calc.mean(my_data, value_var)`: The weighted arithmetic average of `value_var`.
• `calc.quantile(my_data, value_var, q)`: The weighted quantile of `value_var`, where `q` is between 0 and 1.
• `calc.median(my_data, value_var)`: The weighted median of `value_var`, equivalent to `.quantile(...)` where `q=0.5`.
• `calc.std(my_data, value_var)`: The weighted standard deviation of `value_var`.
• `calc.distribution(my_data, value_var)`: The weighted proportions of `value_var`, interpreting `value_var` as categories.
• `calc.count(my_data)`: The weighted count of all observations, i.e., the total weight.
• `calc.sum(my_data, value_var)`: The weighted sum of `value_var`.

The `obj` parameter above should one of the following:

• A `pandas` `DataFrame` object
• A `pandas` `DataFrame.groupby` object
• A plain Python dictionary where the keys are column names and the values are equal-length lists.

### Basic example

Below is a basic example of using `weightedcalcs` to find what percentage of Wyoming residents are married, divorced, et cetera:

```import pandas as pd
import weightedcalcs as wc

# Load the 2015 American Community Survey person-level responses for Wyoming

# `PWGTP` is the weighting variable used in the ACS's person-level data
calc = wc.Calculator("PWGTP")

# Get the distribution of marriage-status responses
calc.distribution(responses, "marriage_status").round(3).sort_values(ascending=False)

# -- Output --
# marriage_status
# Married                                0.425
# Never married or under 15 years old    0.421
# Divorced                               0.097
# Widowed                                0.046
# Separated                              0.012
# Name: PWGTP, dtype: float64```

### More examples

See this notebook to see examples of other calculations, including grouped calculations.

Max Ghenis has created a version of the example notebook that can be run directly in your browser, via Google Colab.

## Other Python weighted-calculation libraries

Pandas-based utility to calculate weighted means, medians, distributions, standard deviations, and more.

4 tags

## Packages 0

No packages published

•
•

## Languages

You can’t perform that action at this time.