FAQ Having trouble getting Exogenous names in model summaries #5492

emilmirzayev · 2019-02-12T14:49:58Z

Hi. I am using statsmodels installed with Anaconda, with the following versions:


>>> statsmodels.__version__
'0.9.0'
>>> exit()

(base) C:\Users\emirzayev>conda --version
conda 4.6.2

Now when I fit a model, the summary table does not show the names of the variables, only x1, x2, ..., xN. Is there a way to have the variable names in the summary as well, or is this change permanent?

                           Logit Regression Results                           
==============================================================================
Dep. Variable:                 choice   No. Observations:                 1766
Model:                          Logit   Df Residuals:                     1757
Method:                           MLE   Df Model:                            8
Date:                Tue, 12 Feb 2019   Pseudo R-squ.:                  -2.762
Time:                        14:13:09   Log-Likelihood:                -1214.0
converged:                       True   LL-Null:                       -322.66
                                        LLR p-value:                     1.000
==============================================================================
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
x1             0.0751      0.070      1.073      0.283      -0.062       0.212
x2            -0.0944      0.109     -0.866      0.386      -0.308       0.119
x3            -0.1273      0.103     -1.233      0.218      -0.330       0.075
x4             0.0641      0.050      1.273      0.203      -0.035       0.163
x5             0.0682      0.054      1.266      0.205      -0.037       0.174
x6             0.1070      0.122      0.880      0.379      -0.131       0.345
x7            -0.0595      0.076     -0.778      0.437      -0.209       0.090
x8             0.1410      0.143      0.987      0.324      -0.139       0.421
x9             0.1769      0.093      1.896      0.058      -0.006       0.360
==============================================================================

Thanks beforehand


josef-pkt · 2019-02-12T14:59:55Z

Are you using numpy arrays for endog and exog in Logit?

numpy arrays don't hold names of variables/columns, so the param or exog names are just made up.

Using pandas DataFrames for exog, or using formulas, preserves the names and uses them in the summary.

There is a way to set the names but that still does not have a very clean API.
If you have your own xnames, then
model.exog_names[:] = xnames
Note that this is an in-place modification, not an assignment.

just for summary:
summary has an xname keyword that allows overriding the parameter/exog names. That will not change any attributes and is only used for creating the summary table.

xnames needs to be a list of strings with the same length as params.

emilmirzayev · 2019-02-12T15:15:29Z

Yes, I am using NumPy arrays for this. The reason I only noticed it now is that I have just used fit_transform on the data. I will try your method and post my feedback here. Waiting for the code to finish running first.

emilmirzayev · 2019-02-12T15:56:12Z

@josef-pkt, I tried to assign the new names after the initialization, as follows:

model_ap_simple = sm.Logit(y_train, X_train)

model_ap_simple.exog_names[:] = exog_variables_simple
model_ap_simple.fit()
Optimization terminated successfully.
         Current function value: 0.688823
         Iterations 5
<statsmodels.discrete.discrete_model.BinaryResultsWrapper at 0x213ce544080>

with open("AffinityPropagationSimpleModel.txt", "w") as file:
    file.write(str(model_ap_simple.summary()))
Traceback (most recent call last):

  File "<ipython-input-141-efe57b8b3497>", line 2, in <module>
    file.write(str(model_ap_simple.summary()))

AttributeError: 'Logit' object has no attribute 'summary'

and also after init-fit phase

    model_ap_simple.exog_names[:] = exog_variables_simple

  File "C:\Users\emirzayev\AppData\Local\Continuum\anaconda3\lib\site-packages\statsmodels\base\wrapper.py", line 35, in __getattribute__
    obj = getattr(results, attr)

AttributeError: 'LogitResults' object has no attribute 'exog_names'

In both cases I got an error.
I am probably doing something wrong. I would appreciate any help.

josef-pkt · 2019-02-12T16:10:18Z

AFAICS, you are mixing up the model and results instances (this differs from sklearn).

model_ap_simple.fit()
returns a results instance and, in general, does not change the model instance model_ap_simple.

In the second case you tried to access exog_names on the results instance and not on the model instance.

try this:

model_ap_simple = sm.Logit(y_train, X_train)
model_ap_simple.exog_names[:] = exog_variables_simple

results_ap_simple = model_ap_simple.fit()
print(results_ap_simple.summary())

emilmirzayev · 2019-02-12T18:42:45Z

It did work! Thank you for answering on such short notice.

stevenlis · 2019-02-12T21:24:22Z

@josef-pkt I'm always curious about this. Is there any reason why Date and Time always show in a summary table? Is there any way to hide them?

josef-pkt · 2019-02-13T14:27:33Z

@StevenLi-DS

It's just so we know when we estimated the model (or better when we printed the summary *).
You are the first to ask for hiding/removing it.
A few weeks ago I saw a twitter comment from someone who was happy to have the date and time.

When I wrote the first implementation of summary, I just browsed through what several statistics and econometrics programs (especially Stata) were showing in the summary, and added what I thought looks useful and "traditional".

There is currently no option to adjust what is in the summary; what is included is hardcoded for each model. summary2 is more flexible than summary, but I never looked into whether we can make results statistics optional when using summary2.

(*) We have an issue to show the fit time instead of the summary time, but I haven't convinced myself yet that we want to call time in fit, i.e. do this extra work in all fit methods.

stevenlis · 2019-02-13T15:22:47Z

So far I can't think of a useful case for hiding it, other than not exposing that I have nothing better to do than stats late at night. lol...😂

stevenlis · 2019-02-24T21:18:29Z

I just realized that I could actually modify the statsmodels.regression.linear_model.OLS in place? Shouldn't this be prevented?

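import statsmodels.formula.api as smf

# formula and df are assumed to be defined earlier
model = smf.ols(formula, data=df)
results = model.fit()
exogs_list = model.exog_names    # the list the model itself uses, not a copy
exogs_list.remove('Intercept')   # this mutates the model's names in place
endog_name = model.endog_names
# calling summary() now raises an error
results.summary()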

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in 
      7 endog_name = model.endog_names
      8 
----> 9 results.summary()

~\Anaconda3\lib\site-packages\statsmodels\regression\linear_model.py in summary(self, yname, xname, title, alpha)
   2406                              yname=yname, xname=xname, title=title)
   2407         smry.add_table_params(self, yname=yname, xname=xname, alpha=alpha,
-> 2408                               use_t=self.use_t)
   2409 
   2410         smry.add_table_2cols(self, gleft=diagn_left, gright=diagn_right,

~\Anaconda3\lib\site-packages\statsmodels\iolib\summary.py in add_table_params(self, res, yname, xname, alpha, use_t)
    862         if res.params.ndim == 1:
    863             table = summary_params(res, yname=yname, xname=xname, alpha=alpha,
--> 864                                    use_t=use_t)
    865         elif res.params.ndim == 2:
    866 #            _, table = summary_params_2dflat(res, yname=yname, xname=xname,

~\Anaconda3\lib\site-packages\statsmodels\iolib\summary.py in summary_params(results, yname, xname, alpha, use_t, skip_header, title)
    465 
    466     if len(xname) != len(params):
--> 467         raise ValueError('xnames and params do not have the same length')
    468 
    469     params_stubs = xname

ValueError: xnames and params do not have the same length

emilmirzayev · 2019-02-25T08:21:38Z

@StevenLi-DS, I think you are right. Maybe only renaming should be allowed when modifying exog_names?
Deleting a variable ex post, just by removing its name from exog_names, should not be possible.

stevenlis · 2019-02-27T19:13:22Z

I think maybe it should just return a copy of the exog names
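
A plain-Python sketch of the difference being discussed. The classes SharedNames and CopiedNames are hypothetical illustrations and not how statsmodels is implemented; they only show what changes when an attribute returns its internal list versus a copy:

```python
class SharedNames:
    """Hypothetical holder that hands out its internal list directly."""

    def __init__(self, names):
        self._names = list(names)

    @property
    def exog_names(self):
        return self._names  # the caller receives the same list object


class CopiedNames(SharedNames):
    """Hypothetical variant that returns a fresh copy on every access."""

    @property
    def exog_names(self):
        return list(self._names)


shared = SharedNames(["Intercept", "x1", "x2"])
shared.exog_names.remove("Intercept")  # mutates the holder's own list
print(shared.exog_names)               # ['x1', 'x2']

copied = CopiedNames(["Intercept", "x1", "x2"])
copied.exog_names.remove("Intercept")  # only the temporary copy is changed
print(copied.exog_names)               # ['Intercept', 'x1', 'x2']

# In-place renaming through the attribute would also stop working with copies.
copied.exog_names[:] = ["const", "a", "b"]
print(copied.exog_names)               # ['Intercept', 'x1', 'x2']

# User-side workaround with the shared variant: copy before editing.
safe = list(shared.exog_names)
safe.append("extra")
print(shared.exog_names)               # ['x1', 'x2']
```

One trade-off worth noting: returning a copy would protect against the accidental removal shown above, but it would also silently disable the in-place renaming trick (exog_names[:] = xnames) suggested earlier in this thread.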

kshedden · 2019-02-27T19:48:24Z

In general I don't think we have ever tried to prevent people from changing attributes of model or results classes. There are some "cache readonly" attributes in the results classes that cannot be changed, but this is an indirect effect of their being cached so that they are not repeatedly recomputed.

josef-pkt · 2019-02-27T20:19:10Z

To emphasize Kerby's comment.

In general the user should not change any attributes of either model or results. There is a kind of exception for changing exog_names as in my example above because we don't have an official interface for changing it.

However, not changing attributes is not enforced. If users change attributes, they do so at their own risk.
One reason that we cannot enforce it is that we are using those backdoors (changing attributes during execution) internally, and there is no easy way to prevent users from doing it if we want to do it on our own.
Also, for expert usage, e.g. when I am writing a new model prototype, I often just manipulate the attributes of a model or results instance. This is fragile and not safe, but it allows for fast writing of experimental code, which can then be safeguarded by unit tests or rewritten into a cleaner version.

stevenlis · 2019-02-27T22:01:16Z

@kshedden @josef-pkt Thanks for the explanation. Maybe it should be written in the doc to warn the users.

josef-pkt added comp-base comp-io FAQ labels Feb 12, 2019

josef-pkt changed the title ~~[Question] Having trouble getting Exogenous names in model summaries~~ FAQ Having trouble getting Exogenous names in model summaries Feb 12, 2019
