In [None]:
# Julia


## Introduction

In labor economics, an important question is what determines workers' wages. This is a causal question, but we can begin to investigate it from a predictive perspective.

In the following wage example, $Y$ is the hourly wage of a worker and $X$ is a vector of the worker's characteristics, e.g., education, experience, and gender.
The two main questions here are:

* How to use job-relevant characteristics, such as education and experience, to best predict wages?

* What is the difference in predicted wages between men and women with the same job-relevant characteristics?

In this lab, we focus on the prediction question first.

In [None]:
## Data

In [None]:
The data set we consider is from the March Supplement of the U.S. Current Population Survey, year 2015. We select white non-Hispanic individuals, aged 25 to 64 years, who work more than 35 hours per week during at least 50 weeks of the year. We exclude self-employed workers; individuals living in group quarters; individuals in the military, agricultural, or private household sectors; individuals with inconsistent reports on earnings and employment status; individuals with allocated or missing information in any of the variables used in the analysis; and individuals with an hourly wage below $3$.

The variable of interest $Y$ is the hourly wage rate constructed as the ratio of the annual earnings to the total number of hours worked, which is constructed in turn as the product of number of weeks worked and the usual number of hours worked per week. In our analysis, we also focus on single (never married) workers. The final sample is of size $n = 5150$.
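
As a small illustration of how the hourly wage is constructed, the sketch below applies the definition to made-up numbers. The function name `hourly_wage` and the input values are hypothetical and are not taken from the CPS data; the last line reflects the assumption that the `lwage` column in the data is the natural logarithm of `wage`.

```julia
# Hourly wage = annual earnings / (weeks worked * usual hours per week).
# The function name and the inputs below are illustrative assumptions.
function hourly_wage(annual_earnings, weeks_worked, hours_per_week)
    total_hours = weeks_worked * hours_per_week
    return annual_earnings / total_hours
end

# Hypothetical worker: 40,000 per year, 50 weeks, 40 hours per week
wage = hourly_wage(40_000.0, 50, 40)   # 20.0
lwage = log(wage)                      # log hourly wage (assumed definition of `lwage`)
```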

In [None]:
## Data Analysis

In [1]:
using Pkg
Pkg.add("CSV")
Pkg.add("DataFrames")
Pkg.add("Dates")
Pkg.add("Plots")
using CSV
using DataFrames
using Dates
using Plots

[32m[1m    Updating[22m[39m registry at `C:\Users\Carol\.julia\registries\General.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environme

In [5]:
# Read the CSV file into a DataFrame.
# The occupation and industry codes are read as strings so that they are treated as categorical variables.
data = CSV.File("C:/Users/Carol/OneDrive/Documents/ECO224/data/wage2015_subsample_inference.csv"; types = Dict("occ" => String, "occ2" => String, "ind" => String, "ind2" => String)) |> DataFrame
println("Number of Rows : ", nrow(data), "\n", "Number of Columns : ", ncol(data))

Number of Rows : 5150
Number of Columns : 21


In [6]:
[eltype(col) for col = eachcol(data)]

21-element Vector{DataType}:
 Int64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 Float64
 String
 String
 String
 String

In [7]:
first(data,10)

| Row | rownames | wage | lwage | sex | shs | hsg | scl | clg | ad |
|---|---|---|---|---|---|---|---|---|---|
| | Int64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 | Float64 |
| 1 | 10 | 9.61538 | 2.26336 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 2 | 12 | 48.0769 | 3.8728 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 3 | 15 | 11.0577 | 2.40313 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 4 | 18 | 13.9423 | 2.63493 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
| 5 | 19 | 28.8462 | 3.36198 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 6 | 30 | 11.7308 | 2.46222 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |
| 7 | 43 | 19.2308 | 2.95651 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 8 | 44 | 19.2308 | 2.95651 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 9 | 47 | 12.0 | 2.48491 | 1.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 |
| 10 | 71 | 19.2308 | 2.95651 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 |


In [8]:
describe(data)

| Row | variable | mean | min | median | max | nunique | nmissing | eltype |
|---|---|---|---|---|---|---|---|---|
| | Symbol | Union… | Any | Union… | Any | Union… | Nothing | DataType |
| 1 | rownames | 15636.3 | 10.0 | 15260.0 | 32643.0 | | | Int64 |
| 2 | wage | 23.4104 | 3.02198 | 19.2308 | 528.846 | | | Float64 |
| 3 | lwage | 2.97079 | 1.10591 | 2.95651 | 6.2707 | | | Float64 |
| 4 | sex | 0.444466 | 0.0 | 0.0 | 1.0 | | | Float64 |
| 5 | shs | 0.023301 | 0.0 | 0.0 | 1.0 | | | Float64 |
| 6 | hsg | 0.243883 | 0.0 | 0.0 | 1.0 | | | Float64 |
| 7 | scl | 0.278058 | 0.0 | 0.0 | 1.0 | | | Float64 |
| 8 | clg | 0.31767 | 0.0 | 0.0 | 1.0 | | | Float64 |
| 9 | ad | 0.137087 | 0.0 | 0.0 | 1.0 | | | Float64 |
| 10 | mw | 0.259612 | 0.0 | 0.0 | 1.0 | | | Float64 |


In [9]:
n = size(data)[1]
z = select(data, Not([:rownames, :lwage, :wage]))
p = size(z)[2] 

println("Number of observations : ", n, "\n","Number of raw regressors:", p )

Number of observations : 5150
Number of raw regressors:18


In [10]:
z_subset = select(data, ["lwage","sex","shs","hsg","scl","clg","ad","mw","so","we","ne","exp1"])
describe(z_subset, :mean)

| Row | variable | mean |
|---|---|---|
| | Symbol | Float64 |
| 1 | lwage | 2.97079 |
| 2 | sex | 0.444466 |
| 3 | shs | 0.023301 |
| 4 | hsg | 0.243883 |
| 5 | scl | 0.278058 |
| 6 | clg | 0.31767 |
| 7 | ad | 0.137087 |
| 8 | mw | 0.259612 |
| 9 | so | 0.296505 |
| 10 | we | 0.216117 |


In [None]:
## Prediction Question

In [None]:
Now we construct a prediction rule for the hourly wage $Y$ that depends linearly on job-relevant characteristics $X$:

$$Y = \beta' X + \epsilon $$
 
Our goals are to:

* Predict wages using various characteristics of workers.

* Assess the predictive performance using the (adjusted) sample $MSE$, the (adjusted) sample $R^2$, and the out-of-sample $MSE$ and $R^2$.
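
The in-sample performance measures listed above can be sketched as follows. This is a minimal illustration on simulated data using only Julia's standard library; the variable names, the simulated data-generating process, and the specific degrees-of-freedom adjustments ($n/(n-p)$ for the MSE and the usual adjusted $R^2$ formula) are assumptions made for the example, not results from the wage data.

```julia
using LinearAlgebra, Random, Statistics

Random.seed!(1)
n, k = 200, 3                       # simulated sample size and number of regressors
X = hcat(ones(n), randn(n, k))      # design matrix with an intercept column
y = X * [1.0, 0.5, -0.3, 0.0] + 0.5 * randn(n)
p = size(X, 2)                      # number of estimated coefficients

beta_hat = X \ y                    # least-squares coefficients
resid = y - X * beta_hat

mse     = mean(abs2, resid)                          # in-sample MSE
mse_adj = n / (n - p) * mse                          # degrees-of-freedom adjusted MSE
r2      = 1 - sum(abs2, resid) / sum(abs2, y .- mean(y))
r2_adj  = 1 - (1 - r2) * (n - 1) / (n - p)           # adjusted R^2

println((mse = mse, mse_adj = mse_adj, r2 = r2, r2_adj = r2_adj))
```

The adjusted measures penalize the number of estimated coefficients, so the adjusted MSE is never smaller than the raw MSE and the adjusted $R^2$ is never larger than the raw $R^2$.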

We employ two different specifications for prediction:

- **Basic Model**: $X$ consists of a set of raw regressors (e.g., gender, experience, education indicators, occupation and industry indicators, regional indicators).

- **Flexible Model**: $X$ consists of all raw regressors from the basic model plus occupation and industry indicators, transformations (e.g., $exp2$ and $exp3$), and additional two-way interactions of a polynomial in experience with the other regressors. An example of a regressor created through a two-way interaction is experience times the indicator of having a college degree.

Using the **Flexible Model** enables us to approximate the true relationship with a more complex regression model and therefore to reduce the bias. The **Flexible Model** increases the range of potential shapes of the estimated regression function. In general, flexible models often deliver good prediction accuracy but are harder to interpret.
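
To make the distinction between the two specifications concrete, the following sketch builds a basic and a flexible design matrix by hand and compares their in-sample and held-out MSE. Everything here is simulated: the variables `exper` and `clg`, the coefficients of the data-generating process, and the half-and-half sample split are assumptions chosen for illustration and do not come from the wage data.

```julia
using LinearAlgebra, Random, Statistics

Random.seed!(2)
n = 1000
exper = 40 .* rand(n)                 # simulated years of experience (hypothetical)
clg   = Float64.(rand(n) .< 0.3)      # simulated college indicator (hypothetical)
# Simulated log wage with an experience profile that differs by education
y = 2.5 .+ 0.04 .* exper .- 0.0007 .* exper .^ 2 .+ 0.3 .* clg .+
    0.01 .* exper .* clg .+ 0.3 .* randn(n)

# Basic design: intercept and raw regressors
X_basic = hcat(ones(n), exper, clg)
# Flexible design: adds a transformation and two-way interactions with the indicator
X_flex  = hcat(X_basic, exper .^ 2, exper .* clg, (exper .^ 2) .* clg)

# Fit on the first half of the sample, evaluate on the second half
train, holdout = 1:(n ÷ 2), (n ÷ 2 + 1):n
in_mse(X)  = mean(abs2, y[train] - X[train, :] * (X[train, :] \ y[train]))
oos_mse(X) = mean(abs2, y[holdout] - X[holdout, :] * (X[train, :] \ y[train]))

println("in-sample MSE:     basic = ", in_mse(X_basic), ", flexible = ", in_mse(X_flex))
println("out-of-sample MSE: basic = ", oos_mse(X_basic), ", flexible = ", oos_mse(X_flex))
```

In-sample, the flexible design can never fit worse than the basic one because the basic regressors are a subset of the flexible ones. Whether it also predicts better on held-out data depends on the sample size and on how many terms are added.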

Now, let us fit both models to our data by running ordinary least squares (OLS):

In [16]:

Pkg.add("Plots")
Pkg.add("Lathe")
Pkg.add("GLM")
Pkg.add("StatsPlots")
Pkg.add("MLBase")
Pkg.add("StatsModels")
Pkg.add("Combinatorics")
# Load the installed packages
using DataFrames
using CSV
using Plots
using Lathe
using GLM
using Statistics
using StatsPlots
using MLBase
using StatsModels
using Combinatorics

[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`
[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Ch

In [17]:
#basic model
basic  = @formula(lwage ~ (sex + exp1 + shs + hsg + mw + so + we + occ2+ ind2))
basic_results  = lm(basic, data)

StatsModels.TableRegressionModel{LinearModel{GLM.LmResp{Vector{Float64}}, GLM.DensePredChol{Float64, LinearAlgebra.CholeskyPivoted{Float64, Matrix{Float64}}}}, Matrix{Float64}}

lwage ~ 1 + sex + exp1 + shs + hsg + mw + so + we + occ2 + ind2

Coefficients:
─────────────────────────────────────────────────────────────────────────────────
                   Coef.   Std. Error       t  Pr(>|t|)    Lower 95%    Upper 95%
─────────────────────────────────────────────────────────────────────────────────
(Intercept)   3.31109     0.0522344     63.39    <1e-99   3.20869      3.41349
sex          -0.0752517   0.0154463     -4.87    <1e-05  -0.105533    -0.0449702
exp1          0.00739823  0.000666434   11.10    <1e-27   0.00609173   0.00870472
shs          -0.276334    0.04704       -5.87    <1e-08  -0.368553    -0.184115
hsg          -0.20696     0.0179454    -11.53    <1e-29  -0.242141    -0.171779
mw           -0.0541566   0.019798      -2.74    0.0063  -0.0929691   -0.0153441
so           -

In [31]:
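# All combinations of the elements of x, taken 1 up to n at a time
combinations_upto(x, n) = Iterators.flatten(combinations(x, i) for i in 1:n)
# Expand (a + b + ...)^n into main effects and interactions of distinct terms (no term is paired with itself)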
expand_exp(args, deg::ConstantTerm) =
    tuple(((&)(terms...) for terms in combinations_upto(args, deg.n))...)

StatsModels.apply_schema(t::FunctionTerm{typeof(^)}, sch::StatsModels.Schema, ctx::Type) =
    apply_schema.(expand_exp(t.args_parsed...), Ref(sch), ctx)
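
In [None]:
To see which terms a degree-2 expansion of this kind is meant to produce, the sketch below enumerates them with Base Julia only. It is a hypothetical stand-in for the `combinations_upto` helper and does not call StatsModels or Combinatorics; the variable list is copied from the flexible formula used in this lab.

```julia
# Hypothetical stand-in for the expansion, using only Base Julia:
# list the main effects and all two-way interactions of distinct variables.
vars = [:exp1, :exp2, :exp3, :exp4, :shs, :hsg, :occ2, :ind2, :mw, :so, :we]

main_effects = [(v,) for v in vars]
two_way = [(vars[i], vars[j]) for i in 1:length(vars) for j in (i + 1):length(vars)]

println(length(main_effects), " main effects + ", length(two_way), " interactions")
# prints: 11 main effects + 55 interactions
println(first(two_way, 3))
```

Because `occ2` and `ind2` are categorical, each term that involves them expands into several dummy columns, so the design matrix has more columns than the number of terms counted here.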


In [33]:
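# Flexible model: `^2` relies on the custom apply_schema method defined above
# to expand into main effects and two-way interactions of the listed regressors
flex = @formula(lwage ~ sex + (exp1+exp2+exp3+exp4 +shs+hsg+occ2+ind2+mw+so+we)^2)
# Same set of controls with sex as the outcome
flexi = @formula(sex ~ (exp1+exp2+exp3+exp4 +shs+hsg+occ2+ind2+mw+so+we)^2)
regflex = lm(flex, data)
# Stored under a separate name so the first fit is not overwritten
regflexi = lm(flexi, data)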

LoadError: MethodError: no method matching apply_schema(::Tuple{ContinuousTerm{Float64}, …, InteractionTerm{Tuple{ContinuousTerm{Float64}, ContinuousTerm{Float64}}}}, ::StatsModels.FullRank, ::Type{LinearModel}, ::Tuple{ContinuousTerm{Float64}, …, InteractionTerm{Tuple{ContinuousTerm{Float64}, ContinuousTerm{Float64}}}})
Closest candidates are:
  apply_schema(::Tuple{Union{Tuple{AbstractTerm, Vararg{AbstractTerm}}, AbstractTerm}, Vararg{Union{Tuple{AbstractTerm, Vararg{AbstractTerm}}, AbstractTerm}}}, ::Any, ::Type) at C:\Users\Carol\.julia\packages\StatsModels\57Kc9\src\schema.jl:230
  apply_schema(::Any, ::StatsModels.FullRank, ::Type, ::AbstractTerm) at C:\Users\Carol\.julia\packages\StatsModels\57Kc9\src\schema.jl:349
  apply_schema(::Any, ::Any, ::Type) at C:\Users\Carol\.julia\packages\StatsModels\57Kc9\src\schema.jl:229
  ...

In [None]:
## Lasso

In [26]:
Pkg.add("Lasso")
using Lasso

[32m[1m   Resolving[22m[39m package versions...
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Project.toml`
[32m[1m  No Changes[22m[39m to `C:\Users\Carol\.julia\environments\v1.7\Manifest.toml`


In [None]:
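# Flexible specifications (the `^2` syntax relies on the apply_schema method defined earlier)
flex = @formula(lwage ~ sex + (exp1+exp2+exp3+exp4 +shs+hsg+occ2+ind2+mw+so+we)^2)
flexi = @formula(sex ~ (exp1+exp2+exp3+exp4 +shs+hsg+occ2+ind2+mw+so+we)^2)
# Fit each Lasso under its own name so the second fit does not overwrite the first
lasso_flex = fit(LassoModel, flex, data)
lasso_flexi = fit(LassoModel, flexi, data)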

