LASSO Implementation: Condition bug, missing normalization documentation, and intercept fitting option #342

@georeth

Description

Bug: Incorrect condition for dimension check

if n <= p {
    return Err(Failed::fit(
        "Number of rows in X should be >= number of columns in X",
    ));
}

The condition n <= p is inconsistent with the error message: the message says n >= p is required, but the check also rejects n == p. It should be n < p. (My case is full-rank.)
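A minimal sketch of the corrected check, written as a stand-alone function so it can be tried in isolation. The name check_dimensions and the String error type are hypothetical stand-ins for illustration; the library itself returns Failed::fit(...) as shown above.

```rust
/// Hypothetical stand-alone version of the dimension check.
/// `n` is the number of rows (observations) in X, `p` the number of columns.
fn check_dimensions(n: usize, p: usize) -> Result<(), String> {
    // Reject only n < p, so the square case n == p is accepted,
    // which matches the ">=" wording of the error message.
    if n < p {
        return Err("Number of rows in X should be >= number of columns in X".to_string());
    }
    Ok(())
}
```

With this condition, check_dimensions(3, 3) succeeds, while check_dimensions(2, 3) still returns the error.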

L1 penalty normalization not documented

//! \\[\underset{\beta}{minimize} \space \space \sum_{i=1}^n \left( y_i - \beta_0 - \sum_{j=1}^p \beta_jx_{ij} \right)^2 + \alpha \sum_{j=1}^p \lVert \beta_j \rVert_1\\]

The penalty in the docs is alpha * sum(abs(beta)), but I think the code actually uses alpha * n * sum(abs(beta)), where n is the number of observations.
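If that reading of the code is right (an assumption taken from the report above, not verified against the source), the objective being minimized would be the first form below. Dividing it by n gives the second form, which penalizes the mean squared residual rather than the sum:

```latex
\[
\underset{\beta}{\mathrm{minimize}} \;
\sum_{i=1}^n \Bigl( y_i - \beta_0 - \sum_{j=1}^p \beta_j x_{ij} \Bigr)^2
+ \alpha \, n \sum_{j=1}^p \lvert \beta_j \rvert
\quad\Longleftrightarrow\quad
\underset{\beta}{\mathrm{minimize}} \;
\frac{1}{n} \sum_{i=1}^n \Bigl( y_i - \beta_0 - \sum_{j=1}^p \beta_j x_{ij} \Bigr)^2
+ \alpha \sum_{j=1}^p \lvert \beta_j \rvert
\]
```

Under this scaling, a given alpha penalizes the coefficients n times more strongly than the documented formula suggests, so the docs would need to state the normalization explicitly.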

Feature Request: Option to disable intercept fitting

let y = y.sub_scalar(T::from_f64(y.mean_by()).unwrap());

Currently, the mean is always subtracted from y. Would it be possible to add a parameter (e.g., fit_intercept: bool, or the more specific subtract_mean_from_y: bool) that lets users disable intercept fitting, i.e., force the beta_0 parameter in the formula to 0?
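A rough sketch of how such a flag could gate the centering step. The helper prepare_target and its signature are hypothetical and use plain f64 slices instead of the library's array types; it only illustrates the requested behavior, not how the library would implement it.

```rust
/// Hypothetical helper: centers `y` only when an intercept is to be fitted.
/// Returns the (possibly centered) target and the mean that was subtracted
/// (0.0 when no centering happened).
fn prepare_target(y: &[f64], fit_intercept: bool) -> (Vec<f64>, f64) {
    if fit_intercept && !y.is_empty() {
        // Current behavior: subtract the mean of y before fitting.
        let mean = y.iter().sum::<f64>() / y.len() as f64;
        (y.iter().map(|v| v - mean).collect(), mean)
    } else {
        // Requested behavior: leave y untouched, so beta_0 is forced to 0.
        (y.to_vec(), 0.0)
    }
}
```

Defaulting fit_intercept to true would keep the existing behavior for current users, so the option could be added without a breaking change.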
