# Does `PyFixest` match `fixest`? 

This vignette compares estimation results from `fixest` with `pyfixest` via the `rpy2` package.

In [53]:
import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
from rpy2.robjects.packages import importr

import pyfixest as pf

# Activate pandas2ri
pandas2ri.activate()

# Import R packages
fixest = importr("fixest")
stats = importr("stats")

# IPython magic commands for autoreloading
%load_ext autoreload
%autoreload 2

# Get data using pyfixest
data = pf.get_data(model="Feols", N=1000)



## Ordinary Least Squares (OLS)

### IID Inference

First, we estimate a model via `pyfixest` and compute "iid" standard errors.

In [54]:
fit = pf.feols("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = "iid")

We estimate the same model with weights: 

In [55]:
fit_weights = pf.feols("Y ~ X1 + X2 | f1 + f2", data = data, weights = "weights", ssc = pf.ssc(adj = True, cluster_adj = False), vcov = "iid")

We then estimate both models in R with `fixest`, called through `rpy2`:

In [56]:
r_fit = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="iid",
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

r_fit_weights = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    weights=ro.Formula("~weights"),
    vcov="iid",
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).
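
The positional arguments passed to `fixest.ssc()` above are easy to misread. Assuming the argument order documented for `fixest::ssc` (`adj`, `fixef.K`, `cluster.adj`, `cluster.df`, `t.df`, `fixef.force_exact`; newer `fixest` releases rename some of these, so check against the installed version), the sketch below spells out the mapping and evaluates the small-sample factor (N - 1) / (N - K) that `adj = True` is documented to apply. The helper `adj_factor` is hypothetical and not part of either package, and K = 2 assumes that `fixef.K = "none"` leaves the fixed effects out of the parameter count.

```python
# Assumed mapping of fixest.ssc(True, "none", False, "min", "min", False)
# to argument names (taken from the fixest documentation, not verified here).
R_SSC_ARGS = {
    "adj": True,
    "fixef.K": "none",
    "cluster.adj": False,
    "cluster.df": "min",
    "t.df": "min",
    "fixef.force_exact": False,
}


def adj_factor(n_obs: int, n_params: int, adj: bool = True) -> float:
    """Hypothetical helper: factor (N - 1) / (N - K), or 1.0 if adj is off."""
    if not adj:
        return 1.0
    if n_obs <= n_params:
        raise ValueError("need more observations than parameters")
    return (n_obs - 1) / (n_obs - n_params)


# 997 observations remain after dropping NAs; X1 and X2 are the two
# regressors that are not absorbed as fixed effects.
factor = adj_factor(n_obs=997, n_params=2, adj=R_SSC_ARGS["adj"])
print(f"{factor:.6f}")
```

On the Python side, `pf.ssc(adj = True, cluster_adj = False)` sets the two flags that correspond to `adj` and `cluster.adj` in this mapping.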



Let's compare how close the covariance matrices are: 

In [57]:
fit_vcov = fit._vcov
r_vcov = stats.vcov(r_fit)
fit_vcov - r_vcov

array([[-3.46944695e-18, -2.62580214e-20],
       [-2.62580214e-20,  5.42101086e-20]])
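
Reading a matrix of differences by eye works for a 2x2 case, but a single number checked against an explicit tolerance is easier to assert on. The following is a minimal sketch and not part of `pyfixest` or `fixest`: `max_abs_diff` is a hypothetical helper that accepts any two equally shaped 2-D arrays or nested lists, and the tolerance `1e-12` is an arbitrary choice for illustration. The input is the difference matrix printed above, compared against zeros.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 2-D array-likes."""
    return max(
        abs(float(x) - float(y))
        for row_a, row_b in zip(a, b)
        for x, y in zip(row_a, row_b)
    )


# Difference matrix copied from the output above.
diff = [
    [-3.46944695e-18, -2.62580214e-20],
    [-2.62580214e-20, 5.42101086e-20],
]
zeros = [[0.0, 0.0], [0.0, 0.0]]

worst = max_abs_diff(diff, zeros)
assert worst < 1e-12  # illustrative tolerance, not a package default
print(f"max |difference| = {worst:.3e}")
```

In the notebook itself, a call such as `max_abs_diff(fit._vcov, stats.vcov(r_fit))` should work as long as both objects behave as 2-D arrays, which the output above suggests.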

And for WLS: 

In [58]:
fit_weights._vcov - stats.vcov(r_fit_weights)

array([[ 0.00000000e+00, -1.60936260e-20],
       [-1.60936260e-20, -1.08420217e-19]])

We conclude by comparing all estimation results via the `etable` function: 

In [59]:
pf.etable([fit, fit_weights], digits = 6)

                                 est1                     est2
------------  -----------------------  -----------------------
depvar                              Y                        Y
--------------------------------------------------------------
X1            -0.924046*** (0.054373)  -0.854266*** (0.054288)
X2            -0.174107*** (0.014412)  -0.164147*** (0.014849)
--------------------------------------------------------------
f2                                  x                        x
f1                                  x                        x
--------------------------------------------------------------
R2                           0.659044                        -
S.E. type                         iid                      iid
Observations                      997                      997
--------------------------------------------------------------
Significance levels: * p < 0.05, ** p < 0.01, *** p < 0.001
Format of coefficient cell:
Coefficient (Std. Error)


In [60]:
pd.DataFrame(fixest.etable(r_fit, r_fit_weights, digits = 6)).T

Unnamed: 0,0,1,2
0,Dependent Var.:,Y,Y
1,,,
2,X1,-0.924046*** (0.054373),-0.854266*** (0.054288)
3,X2,-0.174107*** (0.014412),-0.164147*** (0.014849)
4,Fixed-Effects:,-----------------------,-----------------------
5,f1,Yes,Yes
6,f2,Yes,Yes
7,_______________,_______________________,_______________________
8,S.E. type,IID,IID
9,Observations,997,997


### Heteroskedastic Errors

We repeat the same exercise with heteroskedasticity-robust (HC1) standard errors:

In [61]:
fit = pf.feols("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = "hetero")
fit_weights = pf.feols("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = "hetero", weights = "weights")

In [62]:
fit_r = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="hetero",
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

fit_weights_r = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    weights=ro.Formula("~weights"),
    vcov="hetero",
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).



As before, we compare the variance-covariance matrices:

In [63]:
fit._vcov - stats.vcov(fit_r)

array([[ 2.61539922e-14,  6.55185745e-14],
       [ 6.55185745e-14, -8.23562681e-15]])

In [64]:
fit_weights._vcov - stats.vcov(fit_weights_r)

array([[7.61158497e-13, 3.29128177e-13],
       [3.29128177e-13, 9.55965450e-14]])

In [65]:
pf.etable([fit, fit_weights], digits = 6)

                                 est1                     est2
------------  -----------------------  -----------------------
depvar                              Y                        Y
--------------------------------------------------------------
X1            -0.924046*** (0.054704)  -0.854266*** (0.063090)
X2            -0.174107*** (0.015009)  -0.164147*** (0.016769)
--------------------------------------------------------------
f2                                  x                        x
f1                                  x                        x
--------------------------------------------------------------
R2                           0.659044                        -
S.E. type                      hetero                   hetero
Observations                      997                      997
--------------------------------------------------------------
Significance levels: * p < 0.05, ** p < 0.01, *** p < 0.001
Format of coefficient cell:
Coefficient (Std. Error)


In [66]:
pd.DataFrame(fixest.etable(fit_r, fit_weights_r, digits = 6)).T

Unnamed: 0,0,1,2
0,Dependent Var.:,Y,Y
1,,,
2,X1,-0.924046*** (0.054704),-0.854266*** (0.063090)
3,X2,-0.174107*** (0.015009),-0.164147*** (0.016769)
4,Fixed-Effects:,-----------------------,-----------------------
5,f1,Yes,Yes
6,f2,Yes,Yes
7,_______________,_______________________,_______________________
8,S.E. type,Heteroskedasticity-rob.,Heteroskedasticity-rob.
9,Observations,997,997


### Cluster-Robust Errors

We conclude the OLS comparison with cluster-robust (CRV1) standard errors, clustered by `f1`.

In [67]:
fit = pf.feols("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = {"CRV1": "f1"})
fit_weights = pf.feols("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = {"CRV1":"f1"}, weights = "weights")

fit_r = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov= ro.Formula("~f1"),
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)
fit_r_weights = fixest.feols(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    weights=ro.Formula("~weights"),
    vcov= ro.Formula("~f1"),
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).



In [68]:
fit._vcov - stats.vcov(fit_r)

array([[-8.45259193e-13,  8.24642309e-15],
       [ 8.24642140e-15, -9.52734528e-15]])

In [69]:
# Compare against the clustered weighted R fit defined above, `fit_r_weights`.
# (`fit_weights_r` is the heteroskedasticity-robust fit from the previous section.)
fit_weights._vcov - stats.vcov(fit_r_weights)

In [70]:
pf.etable([fit, fit_weights], digits = 6)

                                 est1                     est2
------------  -----------------------  -----------------------
depvar                              Y                        Y
--------------------------------------------------------------
X1            -0.924046*** (0.059910)  -0.854266*** (0.060868)
X2            -0.174107*** (0.014363)  -0.164147*** (0.016854)
--------------------------------------------------------------
f2                                  x                        x
f1                                  x                        x
--------------------------------------------------------------
R2                           0.659044                        -
S.E. type                      by: f1                   by: f1
Observations                      997                      997
--------------------------------------------------------------
Significance levels: * p < 0.05, ** p < 0.01, *** p < 0.001
Format of coefficient cell:
Coefficient (Std. Error)


In [71]:
pd.DataFrame(fixest.etable(fit_r, fit_r_weights, digits = 6)).T

Unnamed: 0,0,1,2
0,Dependent Var.:,Y,Y
1,,,
2,X1,-0.924046*** (0.059910),-0.854266*** (0.060868)
3,X2,-0.174107*** (0.014363),-0.164147*** (0.016854)
4,Fixed-Effects:,-----------------------,-----------------------
5,f1,Yes,Yes
6,f2,Yes,Yes
7,_______________,_______________________,_______________________
8,S.E.: Clustered,by: f1,by: f1
9,Observations,997,997


## Poisson Regression

In [72]:
data = pf.get_data(model = "Fepois")

In [73]:
fit_iid = pf.fepois("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = "iid", iwls_tol = 1e-10)
fit_hetero = pf.fepois("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = "hetero", iwls_tol = 1e-10)
fit_crv = pf.fepois("Y ~ X1 + X2 | f1 + f2", data = data, ssc = pf.ssc(adj = True, cluster_adj = False), vcov = {"CRV1":"f1"}, iwls_tol = 1e-10)

fit_r_iid = fixest.fepois(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="iid",
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

fit_r_hetero = fixest.fepois(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov="hetero",
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

fit_r_crv = fixest.fepois(
    ro.Formula("Y ~ X1 + X2 | f1 + f2"),
    data=data,
    vcov=ro.Formula("~f1"),
    ssc=fixest.ssc(True, "none", False, "min", "min", False),
)

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).

R[write to console]: NOTE: 3 observations removed because of NA values (LHS: 1, RHS: 1, Fixed-effects: 1).



In [74]:
fit_iid._vcov - stats.vcov(fit_r_iid)

array([[ 1.20791284e-08, -6.55604931e-10],
       [-6.55604931e-10,  1.69958097e-09]])

In [75]:
fit_hetero._vcov - stats.vcov(fit_r_hetero)

array([[ 2.17883089e-08, -7.37971037e-10],
       [-7.37971037e-10,  3.07279240e-09]])

In [76]:
fit_crv._vcov - stats.vcov(fit_r_crv)

array([[ 1.53194424e-08, -1.16909821e-10],
       [-1.16909821e-10,  3.07270399e-09]])
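
The Poisson differences are on the order of 1e-8 rather than 1e-13. This is plausibly because both implementations fit the model iteratively and stop at a convergence tolerance, though that explanation is an assumption rather than something verified here. A relative measure puts the size of the gap in context. The sketch below uses only numbers printed in this section: the "iid" standard error of `X1` reported by `etable` (0.040758) and the top-left entry of the "iid" difference matrix. The variable names are illustrative.

```python
se_x1_iid = 0.040758           # iid standard error of X1, as reported by etable
abs_diff_var = 1.20791284e-08  # top-left entry of fit_iid._vcov - vcov(fit_r_iid)

# Variance implied by the (rounded) standard error.
var_x1_iid = se_x1_iid ** 2

# Discrepancy in the variance relative to its magnitude.
rel_diff = abs_diff_var / var_x1_iid

print(f"relative difference: {rel_diff:.2e}")
```

A relative gap of this size leaves the standard errors unchanged at the six digits shown in the tables.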

In [77]:
pf.etable([fit_iid, fit_hetero, fit_crv], digits = 6)

                              est1                  est2                  est3
------------  --------------------  --------------------  --------------------
depvar                           Y                     Y                     Y
------------------------------------------------------------------------------
X1            -0.006591 (0.040758)  -0.006591 (0.039125)  -0.006591 (0.034180)
X2            -0.014924 (0.010994)  -0.014924 (0.010496)  -0.014924 (0.010135)
------------------------------------------------------------------------------
f2                               x                     x                     x
f1                               x                     x                     x
------------------------------------------------------------------------------
R2                               -                     -                     -
S.E. type                      iid                hetero                by: f1
Observations                   997                   997                   997

In [78]:
pd.DataFrame(fixest.etable(fit_r_iid, fit_r_hetero, fit_r_crv, digits = 6)).T

Unnamed: 0,0,1,2,3
0,Dependent Var.:,Y,Y,Y
1,,,,
2,X1,-0.006591 (0.040758),-0.006591 (0.039125),-0.006591 (0.034180)
3,X2,-0.014924 (0.010994),-0.014924 (0.010496),-0.014924 (0.010135)
4,Fixed-Effects:,--------------------,--------------------,--------------------
5,f1,Yes,Yes,Yes
6,f2,Yes,Yes,Yes
7,_______________,____________________,____________________,____________________
8,S.E. type,IID,Heteroskedasti.-rob.,by: f1
9,Observations,997,997,997
