# Does `PyFixest` match `fixest`?

This vignette compares estimation results from `pyfixest` with those from R's `fixest`, which we call from Python via the `rpy2` package.

## Setup

In [None]:
import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import importr

import pyfixest as pf

# Enable automatic conversion between pandas and R data frames
pandas2ri.activate()

# Import R packages
fixest = importr("fixest")
stats = importr("stats")
broom = importr("broom")

# Get data using pyfixest
data = pf.get_data(model="Feols", N=10_000, seed=99292)

## Ordinary Least Squares (OLS)

### IID Inference

First, we estimate a model with two fixed effects (`f1` and `f2`) via `pyfixest`, using "iid" standard errors.

In [None]:
fit = pf.feols(fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov="iid")

We estimate the same model with weights:

In [None]:
fit_weights = pf.feols(
    fml="Y ~ X1 + X2 | f1 + f2", data=data, weights="weights", vcov="iid"
)

We then estimate both models with R's `fixest` via `rpy2`:

In [None]:
# Re-activate pandas2ri to ensure conversion context is available
pandas2ri.activate()

r_fit = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="iid",
)

r_fit_weights = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    weights=ro.Formula("~weights"),
    vcov="iid",
)

To see how close the two covariance matrices are, we take their elementwise difference, which should be near zero:

In [None]:
fit_vcov = fit._vcov
r_vcov = stats.vcov(r_fit)
fit_vcov - r_vcov

And for WLS:

In [None]:
fit_weights._vcov - stats.vcov(r_fit_weights)
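
A full difference matrix can be hard to read at a glance. One option is to reduce it to its largest absolute entry and compare that against a tolerance. The sketch below uses small made-up matrices in place of `fit._vcov` and `stats.vcov(r_fit)`; the helper name `max_abs_diff` and the tolerance `1e-8` are illustrative assumptions, not part of either package.

```python
import numpy as np


def max_abs_diff(a, b):
    """Return the largest absolute elementwise difference between two matrices."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    return float(np.max(np.abs(a - b)))


# Made-up stand-ins for the Python and R covariance matrices.
py_vcov = np.array([[4.0e-4, 1.0e-5], [1.0e-5, 9.0e-4]])
r_vcov = py_vcov + 1.0e-12  # simulate a tiny numerical discrepancy

diff = max_abs_diff(py_vcov, r_vcov)
matches = diff < 1e-8  # hypothetical tolerance
print(f"max abs diff: {diff:.2e}, within tolerance: {matches}")
```

In this notebook, the same helper could be applied as `max_abs_diff(fit._vcov, stats.vcov(r_fit))`, assuming both objects convert to numeric arrays of the same shape.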

We conclude by comparing all estimation results via the `tidy` methods:

In [None]:
fit.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(r_fit)).T

In [None]:
fit_weights.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(r_fit_weights)).T
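
Comparing two tidy tables by eye gets tedious as the number of coefficients grows. A possible alternative is to align both tables on the coefficient name and compute differences directly. The sketch below uses made-up tables; the column names (`Estimate`, `Std. Error`, `term`, `estimate`, `std.error`) and the table layout are assumptions and may need adjusting to what `fit.tidy()` and the converted `broom` output actually contain.

```python
import pandas as pd

# Made-up stand-ins for the two tidy tables; the column names are assumptions.
py_tidy = pd.DataFrame(
    {"Estimate": [0.95, -0.17], "Std. Error": [0.012, 0.003]},
    index=["X1", "X2"],
)
r_tidy = pd.DataFrame(
    {
        "term": ["X1", "X2"],
        "estimate": [0.95, -0.17],
        "std.error": [0.012, 0.003],
    }
)

# Index both tables by coefficient name and give them common column names.
left = py_tidy.rename(columns={"Estimate": "estimate", "Std. Error": "std_error"})
right = r_tidy.set_index("term").rename(columns={"std.error": "std_error"})

# Join on the coefficient name and compute absolute differences.
merged = left.join(right, lsuffix="_py", rsuffix="_r")
merged["estimate_diff"] = (merged["estimate_py"] - merged["estimate_r"]).abs()
merged["std_error_diff"] = (merged["std_error_py"] - merged["std_error_r"]).abs()
print(merged[["estimate_diff", "std_error_diff"]])
```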

### Heteroskedastic Errors

We repeat the same exercise with heteroskedastic (HC1) errors:

In [None]:
fit = pf.feols(fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov="hetero")
fit_weights = pf.feols(
    fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov="hetero", weights="weights"
)

In [None]:
# Re-activate pandas2ri to ensure conversion context is available
pandas2ri.activate()

r_fit = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="hetero",
)

r_fit_weights = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    weights=ro.Formula("~weights"),
    vcov="hetero",
)

As before, we compare the variance-covariance matrices:

In [None]:
fit._vcov - stats.vcov(r_fit)

In [None]:
fit_weights._vcov - stats.vcov(r_fit_weights)
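
For reference, the textbook HC1 estimator is the sandwich $(X'X)^{-1} X' \mathrm{diag}(\hat{u}_i^2) X (X'X)^{-1}$ scaled by $n / (n - k)$. The sketch below implements it with NumPy for a plain OLS model without fixed effects, on simulated data. It is not the implementation used by either package, and the function name `hc1_vcov` is hypothetical. With fixed effects, the degrees-of-freedom correction depends on each package's conventions, which this sketch does not attempt to reproduce.

```python
import numpy as np


def hc1_vcov(X, y):
    """Textbook HC1 covariance for OLS of y on X (no fixed effects)."""
    n, k = X.shape
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # Meat: X' diag(u_i^2) X, computed without forming the diagonal matrix.
    meat = (X * resid[:, None] ** 2).T @ X
    # HC1 rescales the HC0 sandwich by n / (n - k).
    return n / (n - k) * bread @ meat @ bread


rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
# The error scale grows with |x|, so the errors are heteroskedastic.
y = 1.0 + 2.0 * X[:, 1] + rng.normal(size=n) * (1.0 + np.abs(X[:, 1]))

V = hc1_vcov(X, y)
print(np.sqrt(np.diag(V)))  # HC1 standard errors
```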

We conclude by comparing all estimation results via the `tidy` methods:

In [None]:
fit.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(r_fit)).T

In [None]:
fit_weights.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(r_fit_weights)).T

### Cluster-Robust Errors

Finally, we repeat the exercise with cluster-robust (CRV1) errors, clustering by `f1`.

In [None]:
fit = pf.feols(fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov={"CRV1": "f1"})
fit_weights = pf.feols(
    fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov={"CRV1": "f1"}, weights="weights"
)

# Re-activate pandas2ri to ensure conversion context is available
pandas2ri.activate()

r_fit = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov=ro.Formula("~f1"),
)
r_fit_weights = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    weights=ro.Formula("~weights"),
    vcov=ro.Formula("~f1"),
)

In [None]:
fit._vcov - stats.vcov(r_fit)

In [None]:
fit_weights._vcov - stats.vcov(r_fit_weights)
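
For reference, a textbook CRV1 estimator replaces the observation-level "meat" of the sandwich with a sum of outer products of cluster-level scores $X_g' \hat{u}_g$. The NumPy sketch below does this for plain OLS without fixed effects, on simulated data. The function name `crv1_vcov` is hypothetical, and the small-sample factor $\frac{G}{G-1} \cdot \frac{n-1}{n-k}$ is one common choice; how `fixest` and `pyfixest` count absorbed fixed effects in this factor is not covered here.

```python
import numpy as np


def crv1_vcov(X, y, clusters):
    """Textbook CRV1 covariance for OLS of y on X (no fixed effects)."""
    n, k = X.shape
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    groups = np.unique(clusters)
    G = len(groups)
    meat = np.zeros((k, k))
    for g in groups:
        mask = clusters == g
        # Sum of scores x_i * u_i within cluster g (a k-vector).
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    # One common small-sample correction (an assumption of this sketch).
    ssc = G / (G - 1) * (n - 1) / (n - k)
    return ssc * bread @ meat @ bread


rng = np.random.default_rng(0)
n_clusters, per_cluster = 25, 20
n = n_clusters * per_cluster
clusters = np.repeat(np.arange(n_clusters), per_cluster)
x = rng.normal(size=n)
# A shock shared within each cluster induces within-cluster correlation.
cluster_shock = rng.normal(size=n_clusters)[clusters]
y = 1.0 + 2.0 * x + cluster_shock + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

V = crv1_vcov(X, y, clusters)
print(np.sqrt(np.diag(V)))  # cluster-robust standard errors
```

With this correction, putting every observation in its own cluster reduces the factor to $n / (n - k)$, so the result coincides with the textbook HC1 estimator.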

We conclude by comparing all estimation results via the `tidy` methods:

In [None]:
fit.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(r_fit)).T

In [None]:
fit_weights.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(r_fit_weights)).T

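## Poisson Regression

We run the same comparison for Poisson regression, covering iid, heteroskedastic, and cluster-robust errors. The `pyfixest` calls set `iwls_tol=1e-10`.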

In [None]:
data = pf.get_data(model="Fepois")

In [None]:
fit_iid = pf.fepois(fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov="iid", iwls_tol=1e-10)
fit_hetero = pf.fepois(
    fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov="hetero", iwls_tol=1e-10
)
fit_crv = pf.fepois(
    fml="Y ~ X1 + X2 | f1 + f2", data=data, vcov={"CRV1": "f1"}, iwls_tol=1e-10
)

# Re-activate pandas2ri to ensure conversion context is available
pandas2ri.activate()

fit_r_iid = fixest.fepois(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="iid",
)

fit_r_hetero = fixest.fepois(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="hetero",
)

fit_r_crv = fixest.fepois(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov=ro.Formula("~f1"),
)

In [None]:
fit_iid._vcov - stats.vcov(fit_r_iid)

In [None]:
fit_hetero._vcov - stats.vcov(fit_r_hetero)

In [None]:
fit_crv._vcov - stats.vcov(fit_r_crv)

We conclude by comparing all estimation results via the `tidy` methods:

In [None]:
fit_iid.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(fit_r_iid)).T

In [None]:
fit_hetero.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(fit_r_hetero)).T

In [None]:
fit_crv.tidy()

In [None]:
pd.DataFrame(broom.tidy_fixest(fit_r_crv)).T