(weight-proportions)=
# Compute age- and length-binned weight proportions

In [2]:
%run ./number_proportions.ipynb

,stratum_ks,length_bin,age_bin,sex,count
0,0,"(1.0, 3.0]","(0.5, 1.5]",female,0
1,0,"(1.0, 3.0]","(0.5, 1.5]",male,0
2,0,"(1.0, 3.0]","(0.5, 1.5]",unsexed,0
3,0,"(1.0, 3.0]","(1.5, 2.5]",female,0
4,0,"(1.0, 3.0]","(1.5, 2.5]",male,0
5,0,"(1.0, 3.0]","(1.5, 2.5]",unsexed,0
6,0,"(1.0, 3.0]","(2.5, 3.5]",female,0
7,0,"(1.0, 3.0]","(2.5, 3.5]",male,0
8,0,"(1.0, 3.0]","(2.5, 3.5]",unsexed,0
9,0,"(1.0, 3.0]","(3.5, 4.5]",female,0


,stratum_ks,length_bin,sex,count
0,1,"(1.0, 3.0]",female,0
1,1,"(1.0, 3.0]",male,0
2,1,"(1.0, 3.0]",unsexed,0
3,1,"(3.0, 5.0]",female,0
4,1,"(3.0, 5.0]",male,0
5,1,"(3.0, 5.0]",unsexed,0
6,1,"(5.0, 7.0]",female,0
7,1,"(5.0, 7.0]",male,0
8,1,"(5.0, 7.0]",unsexed,0
9,1,"(7.0, 9.0]",female,0


## Setting up the environment

With the biological data loaded and number proportions computed, we can now compute the weight proportions. The first step is to distribute binned weights over age, length, and sex across strata. 

## Distributed weights over age, length, and sex across strata

The binned weight distributions are computed separately for aged fish (`dict_df_bio["specimen"]`) and unaged fish (`dict_df_bio["length"]`) using the `binned_weights` function from the `survey.proportions` module. The two datasets are handled separately because they are processed somewhat differently before being combined in a later step. Here, we use `"stratum_ks"` as the stratification definition.

In [3]:
# Pre-allocate a dictionary
dict_df_weight_distr = {}

# Aged
dict_df_weight_distr["aged"] = get_proportions.binned_weights(
    length_dataset=dict_df_bio["specimen"],
    include_filter={"sex": ["female", "male"]},
    interpolate_regression=False,
    contrast_vars="sex",
    table_cols=["stratum_ks", "sex", "age_bin"]
)

# Unaged
dict_df_weight_distr["unaged"] = get_proportions.binned_weights(
    length_dataset=dict_df_bio["length"],
    length_weight_dataset=binned_weight_table,
    include_filter={"sex": ["female", "male"]},
    interpolate_regression=True,
    contrast_vars="sex",
    table_cols=["stratum_ks", "sex"]
)


Unlike the weight proportions, this produces a `pandas.DataFrame` with multiindex columns of `(sex, stratum_ks)` and rows indexed by `length_bin` alone. For the aged animals, the columns include one additional level, `age_bin`. The first 10 rows of `dict_df_weight_distr["aged"]` appear as:

In [4]:
from IPython.display import display

display(dict_df_weight_distr["aged"].head(10))

sex,female,female,female,female,female,female,female,female,female,female,...,male,male,male,male,male,male,male,male,male,male
age_bin,"(0.5, 1.5]","(0.5, 1.5]","(0.5, 1.5]","(0.5, 1.5]","(0.5, 1.5]","(0.5, 1.5]","(0.5, 1.5]","(0.5, 1.5]","(0.5, 1.5]","(1.5, 2.5]",...,"(20.5, 21.5]","(21.5, 22.5]","(21.5, 22.5]","(21.5, 22.5]","(21.5, 22.5]","(21.5, 22.5]","(21.5, 22.5]","(21.5, 22.5]","(21.5, 22.5]","(21.5, 22.5]"
stratum_ks,0,1,2,3,4,5,6,7,8,0,...,8,0,1,2,3,4,5,6,7,8
length_bin,,,,,,,,,,,,,,,,,,,,,
"(1.0, 3.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(3.0, 5.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(5.0, 7.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(7.0, 9.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(9.0, 11.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(11.0, 13.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(13.0, 15.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(15.0, 17.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(17.0, 19.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(19.0, 21.0]",0.0,0.756,0.876,0.052,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


The unaged values in `dict_df_weight_distr["unaged"]` then appear as:

In [5]:
display(dict_df_weight_distr["unaged"].head(10))

sex,female,female,female,female,female,female,female,female,male,male,male,male,male,male,male,male
stratum_ks,1,2,3,4,5,6,7,8,1,2,3,4,5,6,7,8
length_bin,,,,,,,,,,,,,,,,
"(1.0, 3.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(3.0, 5.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(5.0, 7.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(7.0, 9.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(9.0, 11.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(11.0, 13.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(13.0, 15.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(15.0, 17.0]",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(17.0, 19.0]",0.0,0.31356,0.0,0.0,0.0,0.0,0.0,0.0,0.041328,0.197134,0.0,0.0,0.0,0.0,0.0,0.0
"(19.0, 21.0]",3.241816,3.987102,0.615512,0.0,0.0,0.0,0.0,0.0,4.738971,3.1418,1.293984,0.0,0.0,0.0,0.0,0.0
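
The table layout described above (rows indexed by `length_bin`, columns indexed by `(sex, stratum_ks)`) can be reproduced with plain `pandas` on a toy dataset. This is only a minimal sketch of the data structure, not the internals of `binned_weights`: the `toy` values, the bin edges, and the names `toy_binned` and `stratum_1` are all made up for illustration.

```python
import pandas as pd

# Toy specimen-like data; every value here is invented for illustration
toy = pd.DataFrame({
    "stratum_ks": [1, 1, 1, 2, 2, 2],
    "sex": ["female", "male", "female", "male", "female", "male"],
    "length": [19.5, 20.2, 17.8, 20.9, 18.4, 19.1],
    "weight": [0.05, 0.06, 0.04, 0.07, 0.045, 0.05],
})

# Right-closed 2 cm length bins, e.g. (17.0, 19.0] and (19.0, 21.0]
length_edges = [17.0, 19.0, 21.0]
toy["length_bin"] = pd.cut(toy["length"], bins=length_edges)

# Sum weights per length bin (rows) and per (sex, stratum_ks) pair (columns)
toy_binned = toy.pivot_table(
    index="length_bin",
    columns=["sex", "stratum_ks"],
    values="weight",
    aggfunc="sum",
    fill_value=0.0,
    observed=False,
)

# Pull out a single stratum across both sexes from the multiindex columns
stratum_1 = toy_binned.xs(1, level="stratum_ks", axis=1)
print(toy_binned)
print(stratum_1)
```

Selecting with `xs` on the column level is one convenient way to inspect a single stratum without flattening the multiindex.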


## Convert summed weights into weight proportions

These binned weights are normalized into proportions using the `weight_proportions` function. Because aged and unaged samples are processed separately, as noted above, the two groups are treated slightly differently. We first compute the weight proportions for aged fish only, using the summed *combined* weights (i.e. from the individual specimens and haul weight totals) as the basis for the proportions. This effectively represents the same quantity as `proportion_overall` in the number proportions `pandas.DataFrame`.

In [6]:
# Initialize dictionary container
dict_df_weight_proportion = {}

# Aged
dict_df_weight_proportion["aged"] = get_proportions.weight_proportions(
    weight_data=dict_df_weight_distr, 
    catch_data=dict_df_bio["catch"], 
    group="aged",
    stratum_col="stratum_ks"
)

This outputs a `pandas.DataFrame` with `stratum_ks` columns and a row multiindex of (`length_bin`, `sex`, `age_bin`):

In [7]:
display(dict_df_weight_proportion["aged"].head(10))

,,stratum_ks,0,1,2,3,4,5,6,7,8
length_bin,sex,age_bin,,,,,,,,,
"(1.0, 3.0]",female,"(0.5, 1.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(1.5, 2.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(2.5, 3.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(3.5, 4.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(4.5, 5.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(5.5, 6.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(6.5, 7.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(7.5, 8.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(8.5, 9.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"(1.0, 3.0]",female,"(9.5, 10.5]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
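
As a rough illustration of the normalization described above, the sketch below divides toy aged weights by a per-stratum combined total. It assumes that "combined" means the summed specimen weights plus the summed haul catch weights for each stratum; that is one reading of the text, and the actual `weight_proportions` implementation may differ. All names and values (`aged_weights`, `catch_weights`, `combined_total`, `aged_proportions`) are hypothetical.

```python
import pandas as pd

# Toy aged binned weights: rows are length bins, columns are strata (invented values)
aged_weights = pd.DataFrame(
    {1: [0.0, 2.0, 6.0], 2: [1.0, 3.0, 0.0]},
    index=pd.Index(
        ["(15.0, 17.0]", "(17.0, 19.0]", "(19.0, 21.0]"], name="length_bin"
    ),
)
aged_weights.columns.name = "stratum_ks"

# Toy haul catch weights, already summed per stratum (invented values)
catch_weights = pd.Series({1: 12.0, 2: 16.0}, name="weight")
catch_weights.index.name = "stratum_ks"

# ASSUMPTION: the denominator is specimen weight plus haul catch weight per stratum
combined_total = aged_weights.sum(axis=0) + catch_weights

# Divide each stratum column by its combined total
aged_proportions = aged_weights.div(combined_total, axis=1)
print(aged_proportions)
```

Under this assumption the aged proportions in a stratum sum to less than 1, with the remainder attributable to the unaged fish.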


### Scaling weight proportions for unaged fish

The first step for the unaged weight proportions involves scaling the binned weights using the total haul catch weights as a reference via the `scale_weights_by_stratum` function.

In [8]:
# Scale haul weights for unaged fish
scaled_sexed_unaged_weights_df = get_proportions.scale_weights_by_stratum(
    weights_df=dict_df_weight_distr["unaged"], 
    reference_weights_df=dict_df_bio["catch"].groupby(["stratum_ks"])["weight"].sum(),
    stratum_col="stratum_ks",
)
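
One plausible reading of this scaling step is that the binned weights in each stratum are rescaled so that their total matches the reference haul weight for that stratum. The sketch below shows that idea with plain `pandas`. It is an assumption about the general technique, not a description of what `scale_weights_by_stratum` does internally, and the names `weights`, `reference`, `scale`, and `scaled` are hypothetical.

```python
import pandas as pd

# Toy unaged binned weights with (sex, stratum_ks) columns (invented values)
weights = pd.DataFrame(
    [[0.0, 1.0, 0.5, 0.0], [3.0, 4.0, 4.5, 3.0]],
    index=pd.Index(["(17.0, 19.0]", "(19.0, 21.0]"], name="length_bin"),
    columns=pd.MultiIndex.from_product(
        [["female", "male"], [1, 2]], names=["sex", "stratum_ks"]
    ),
)

# Toy reference haul weights per stratum (invented values)
reference = pd.Series({1: 16.0, 2: 24.0}, name="weight")
reference.index.name = "stratum_ks"

# Total binned weight per stratum, summed over length bins and sexes
stratum_totals = weights.sum(axis=0).groupby(level="stratum_ks").sum()

# ASSUMPTION: one multiplicative factor per stratum maps binned totals onto the reference
scale = reference / stratum_totals

# Expand the per-stratum factors to one factor per column, then rescale
per_column_scale = scale.reindex(
    weights.columns.get_level_values("stratum_ks")
).to_numpy()
scaled = weights * per_column_scale

# The rescaled totals now match the reference weights in each stratum
scaled_totals = scaled.sum(axis=0).groupby(level="stratum_ks").sum()
print(scaled_totals)
```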

We can then use these scaled weights to compute the unaged weight proportions via `scale_weight_proportions`, which is a more complicated task than for the aged fish. It incorporates the aged and unaged number proportions (to estimate sex ratios), binned weights computed for *all* fish, and the total haul weights.

In [9]:
# Compute the scaled weight proportions for unaged fish

dict_df_weight_proportion["unaged"] = get_proportions.scale_weight_proportions(
    weight_data=scaled_sexed_unaged_weights_df, 
    reference_weight_proportions=dict_df_weight_proportion["aged"], 
    catch_data=dict_df_bio["catch"], 
    number_proportions=dict_df_number_proportion,
    binned_weights=binned_weight_table["all"],
    group="unaged",
    group_columns=["sex"],
    stratum_col="stratum_ks"
)

This produces a table similar to that of the aged fish, with rows indexed by (`sex`, `length_bin`):

In [10]:
display(dict_df_weight_proportion["unaged"].head(10))

,stratum_ks,0,1,2,3,4,5,6,7,8
sex,length_bin,,,,,,,,,
female,"(1.0, 3.0]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
female,"(3.0, 5.0]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
female,"(5.0, 7.0]",,0.0,1.9e-05,0.0,0.0,0.0,0.0,0.0,0.0
female,"(7.0, 9.0]",,0.0,4.6e-05,0.0,0.0,0.0,0.0,0.0,0.0
female,"(9.0, 11.0]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
female,"(11.0, 13.0]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
female,"(13.0, 15.0]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
female,"(15.0, 17.0]",,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
female,"(17.0, 19.0]",,0.000243,0.027601,0.0,0.0,0.0,0.0,0.0,0.0
female,"(19.0, 21.0]",,0.05551,0.307074,0.011452,0.0,0.0,0.0,0.0,0.0
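
Tables with this layout, a (`sex`, `length_bin`) row multiindex and `stratum_ks` columns, can be summarized with standard `pandas` operations. The sketch below uses a small invented table (`toy_unaged`) rather than the real `dict_df_weight_proportion["unaged"]` output, so the numbers carry no meaning beyond showing the indexing pattern.

```python
import pandas as pd

# Toy table mimicking the (sex, length_bin) x stratum_ks layout (invented values)
rows = pd.MultiIndex.from_product(
    [["female", "male"], ["(17.0, 19.0]", "(19.0, 21.0]"]],
    names=["sex", "length_bin"],
)
toy_unaged = pd.DataFrame(
    {1: [0.01, 0.05, 0.02, 0.06], 2: [0.03, 0.30, 0.02, 0.25]},
    index=rows,
)
toy_unaged.columns.name = "stratum_ks"

# Collapse the length bins to get one proportion per sex and stratum
by_sex = toy_unaged.groupby(level="sex").sum()

# Length distribution for a single sex within a single stratum
female_stratum_2 = toy_unaged.xs("female", level="sex")[2]

print(by_sex)
print(female_stratum_2)
```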
