MRA modeling fails when one of the datasets has a constant field

When doing exploratory testing with small subsets of data, I sometimes hit cases where one of the columns in the test data was constant, i.e. every row in that subset had the same value.

There is code in the MRA modeling method which adds a constant column to each dataframe, to represent the intercept value in the model: https://github.com/larsiusprime/openavmkit/blob/295c0808c36a1b9562f6472c6379cc93386f8097/openavmkit/modeling.py#L1657

By default, the add_constant method does not add a new column if the dataframe already contains a constant column. Because of this, if a column happens to be constant in the X_test dataframe but not in the other dataframes, X_test ends up with n columns while the other dataframes have n + 1. The mismatched column counts cause failures in the prediction phase, after the model has been fitted.

I think passing the has_constant='add' argument to add_constant would prevent this issue; see https://www.statsmodels.org/stable/generated/statsmodels.tools.tools.add_constant.html

MRA modeling fails when one of the datasets has a constant field #150

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

MRA modeling fails when one of the datasets has a constant field #150

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions