WIP: Summary2 #582

wants to merge 20 commits into


4 participants


Cleaned-up version of what I had posted but deleted a few days ago. It now works as advertised, but some of it could certainly benefit from a more experienced Python hand.

Let me know if any of this is usable.


@josef-pkt josef-pkt and 1 other commented on an outdated diff Nov 26, 2012
+ # Model info
+ model_info = summary_model(self)
+ model_info['Method:'] = self.model.method
+ model_info['Sample:'] = sample[0]
+ model_info['S.D. of innovations:'] = "%#5.3f" % self.sigma2**.5
+ model_info['HQIC:'] = "%#5.3f" % self.hqic
+ model_info['No. Observations:'] = str(len(self.model.endog))
josef-pkt Nov 26, 2012 Member

What determines the sequence in which these items show up in the table, and whether the info is on the left or right?

Is it still possible to do 'Sample' in two lines?

vincentarelbundock Nov 27, 2012 Member

The order in which elements were entered in the OrderedDict determines where they fall in the table. We split left/right automatically in the middle of the dict (with white space on the right side if the number of elements is odd). You can tweak the order by rearranging the summary_model() function, or by sorting the resulting OrderedDict. In the current setup, arima-specific fields are just appended at the end of the basic information fields shared by other models.

Alternatively, you could sacrifice convenience for control by creating a NumPy array or DataFrame with 4 columns (labels in cols 1 and 3) and then using Summary().add_df or add_array.
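
To make the left/right split described above concrete, here is a minimal sketch of the idea, not the code from this PR. The helper name two_column_rows and the field values are placeholders chosen for illustration; only the behaviour (split in the middle of the OrderedDict, white space on the right when the count is odd) comes from the comment above.

```python
from collections import OrderedDict


def two_column_rows(info):
    """Split an ordered mapping in the middle into 4-tuples of
    (left label, left value, right label, right value)."""
    items = list(info.items())
    half = (len(items) + 1) // 2              # left side takes the extra item
    left, right = items[:half], items[half:]
    right += [("", "")] * (len(left) - len(right))   # pad right side with blanks
    return [l + r for l, r in zip(left, right)]


# Placeholder values, for illustration only.
model_info = OrderedDict()
model_info["Model:"] = "ARMA"
model_info["Method:"] = "css-mle"
model_info["No. Observations:"] = "100"
model_info["HQIC:"] = "123.456"
model_info["Sample:"] = "0"

rows = two_column_rows(model_info)
for row in rows:
    print("%-20s %10s   %-20s %10s" % row)
```

Because the result is already a list of 4-column rows, the same structure could serve as input for the array/DataFrame route if more control over ordering is needed.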


updated the example notebook: http://nbviewer.ipython.org/4124662/


It would be nice if summary_col could produce nice-looking HTML tables in the IPython notebook like pandas DataFrames (see pandas-dev/pandas#772). Also, the new model.summary() in a notebook now looks like print(...) output (<pre>...</pre>), whereas the old way was a nice HTML table.


The summary_col(output) method fails for me:

Exception                                 Traceback (most recent call last)
<ipython-input-48-249bb855cece> in <module>()
----> 1 print summary_col(output)

C:\portabel\Python27\lib\site-packages\statsmodels\iolib\summary.py in summary_col(results, float_format, model_names, stars, info_dict)
    460     cols = [_col_info(x, info_dict) for x in results]
    461     merg = lambda x,y: x.merge(y, how='outer', right_index=True, left_index=True)
--> 462     info = reduce(merg, cols)
    463     # Summary
    464     smry = Summary()

C:\portabel\Python27\lib\site-packages\statsmodels\iolib\summary.py in <lambda>(x, y)
    459     # Info as dataframe columns
    460     cols = [_col_info(x, info_dict) for x in results]
--> 461     merg = lambda x,y: x.merge(y, how='outer', right_index=True, left_index=True)
    462     info = reduce(merg, cols)
    463     # Summary

C:\portabel\Python27\lib\site-packages\pandas\core\frame.pyc in merge(self, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy)
   4362                      left_on=left_on, right_on=right_on,
   4363                      left_index=left_index, right_index=right_index, sort=sort,
-> 4364                      suffixes=suffixes, copy=copy)
   4366     #----------------------------------------------------------------------

C:\portabel\Python27\lib\site-packages\pandas\tools\merge.pyc in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy)
     34                          right_index=right_index, sort=sort, suffixes=suffixes,
     35                          copy=copy)
---> 36     return op.get_result()
     37 if __debug__: merge.__doc__ = _merge_doc % '\nleft : DataFrame'

C:\portabel\Python27\lib\site-packages\pandas\tools\merge.pyc in get_result(self)
    186         # this is a bit kludgy
--> 187         ldata, rdata = self._get_merge_data()
    189         # TODO: more efficiently handle group keys to avoid extra

C:\portabel\Python27\lib\site-packages\pandas\tools\merge.pyc in _get_merge_data(self)
    276         lsuf, rsuf = self.suffixes
    277         ldata, rdata = ldata._maybe_rename_join(rdata, lsuf, rsuf,
--> 278                                                 copydata=False)
    279         return ldata, rdata

C:\portabel\Python27\lib\site-packages\pandas\core\internals.pyc in _maybe_rename_join(self, other, lsuffix, rsuffix, copydata)
   1175     def _maybe_rename_join(self, other, lsuffix, rsuffix, copydata=True):
-> 1176         to_rename = self.items.intersection(other.items)
   1177         if len(to_rename) > 0:
   1178             if not lsuffix and not rsuffix:

C:\portabel\Python27\lib\site-packages\pandas\core\index.pyc in intersection(self, other)
    653             this = self.astype('O')
    654             other = other.astype('O')
--> 655             return this.intersection(other)
    657         if self.is_monotonic and other.is_monotonic:

C:\portabel\Python27\lib\site-packages\pandas\core\index.pyc in intersection(self, other)
    662                 pass
--> 664         indexer = self.get_indexer(other.values)
    665         indexer = indexer.take((indexer != -1).nonzero()[0])
    666         return self.take(indexer)

C:\portabel\Python27\lib\site-packages\pandas\core\index.pyc in get_indexer(self, target, method, limit)
    789             this = self.astype(object)
    790             target = target.astype(object)
--> 791             return this.get_indexer(target, method=method, limit=limit)
    793         if not self.is_unique:

C:\portabel\Python27\lib\site-packages\pandas\core\index.pyc in get_indexer(self, target, method, limit)
    793         if not self.is_unique:
--> 794             raise Exception('Reindexing only valid with uniquely valued Index '
    795                             'objects')

Exception: Reindexing only valid with uniquely valued Index objects

It seems to come from the missing column name: adding res.columns = [str(result.model.endog_names)] before return in _col_params(result, float_format='%.4f', stars=True) and out.columns = [str(result.model.endog_names)] in _col_info(result, info_dict=None) made the error go away.
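
A pandas-free sketch of why the column name matters, under the assumption that the diagnosis above is right: each model contributes one named column, the merge is an outer join on the row labels, and columns without distinct names collide. The function outer_merge and the row "x2" are hypothetical and only illustrate the idea; this is not how summary_col is implemented.

```python
def outer_merge(columns):
    """Combine (name, {row_label: value}) pairs into one table.

    Row labels are the union over all columns (an outer join);
    duplicate column names are rejected instead of silently clashing.
    """
    names = [name for name, _ in columns]
    if len(set(names)) != len(names):
        raise ValueError("column names must be unique: %r" % (names,))
    row_labels = []
    for _, col in columns:
        for label in col:
            if label not in row_labels:
                row_labels.append(label)
    table = {label: [col.get(label, "") for _, col in columns]
             for label in row_labels}
    return names, table


# "x2" is a made-up row that exists in only one model.
cols = [
    ("np.log(avg_blub + 1)", {"Intercept": "2.7458***", "BLUBBEREEE": "0.1779***"}),
    ("np.log(blublu + 1)", {"Intercept": "1.3378***", "x2": "0.5000"}),
]
names, table = outer_merge(cols)
for label, cells in table.items():
    print(label, cells)
```

Note that naming columns after endog_names would still collide if two models shared the same dependent variable, so some extra disambiguation may be needed in that case.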


Long "model names" (see comment above) leave the "R2/AIC/N" part of the table misaligned with the rest of the values:

              np.log(avg_blub + 1)  np.log(blublu + 1)  
Intercept     2.7458***              1.3378***          
              (0.0165)               (0.0071)           
BLUBBEREEE    0.1779***              0.1763***          
              (0.0130)               (0.0056)           
R2                    0.053                   0.229     
AIC                   116081.787              55120.079 
N                     36308                   36308     
Standard errors in parentheses.
* p<.1, ** p<.05, ***p<.01

[shortened: I actually had more models (and variables) included]

BUT it works! Thanks a lot!
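
One possible way to avoid the misalignment reported above is to compute a single width per column from the header and every cell, parameter rows and info rows alike, and pad all cells to that width. This is only a sketch with a hypothetical helper format_table, not the formatting code used in the PR; the cell values are taken from the table above.

```python
def format_table(header, rows):
    """Left-align every cell to the widest entry in its column."""
    table = [header] + rows
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    lines = []
    for row in table:
        cells = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append("  ".join(cells).rstrip())
    return "\n".join(lines)


header = ["", "np.log(avg_blub + 1)", "np.log(blublu + 1)"]
rows = [
    ["Intercept", "2.7458***", "1.3378***"],
    ["", "(0.0165)", "(0.0071)"],
    ["R2", "0.053", "0.229"],
    ["N", "36308", "36308"],
]
text = format_table(header, rows)
print(text)
```

The key design choice is that the info rows share the width computation with the parameter rows instead of being formatted as a separate block.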


Possible enhancement for the model comparison table: add results from a t-test.

What looks nice in the econometrics textbook by Stock and Watson is that terms that have interaction effects or polynomial terms (linear and squared terms, for example) are included in the table as a separate row that shows the total effect of a variable and whether it is significantly different from zero.

It should be possible to get everything from the ttest method.
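
As a rough illustration of what such a row would contain, here is a plain-Python sketch of a t statistic for a linear combination of coefficients, t = c'b / sqrt(c'Vc). The function name, the coefficient values, and the covariance matrix are all made up; reading "total effect" for a linear plus squared term as the marginal effect b1 + 2*b2*x at a chosen x is my interpretation, not something taken from the book.

```python
import math


def linear_combination_t(coefs, cov, weights):
    """Return (estimate, t statistic) for the combination weights' * coefs."""
    estimate = sum(w * b for w, b in zip(weights, coefs))
    n = len(weights)
    variance = sum(weights[i] * cov[i][j] * weights[j]
                   for i in range(n) for j in range(n))
    return estimate, estimate / math.sqrt(variance)


# Made-up numbers: b1 (linear term), b2 (squared term) and their covariance.
coefs = [0.5, 0.25]
cov = [[0.04, 0.0],
       [0.0, 0.01]]
x = 1.0
weights = [1.0, 2.0 * x]     # marginal effect b1 + 2*b2*x evaluated at x

estimate, t_stat = linear_combination_t(coefs, cov, weights)
print("effect = %.3f, t = %.3f" % (estimate, t_stat))
```

In practice the ttest method mentioned above would supply these numbers; the sketch only shows which quantities the extra row needs.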


@josef-pkt I don't have the book with me. Could this be done simply by adding new rows like the N, R2? If so, then we can just write a simple helper function to create a dict with the required information and then feed that to the info_dict argument of summary_col(). I think I'd rather nail down the basic functionality before moving ahead with this though.
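
A sketch of what such a helper could look like, assuming info_dict maps a row label to a function that is applied to each results object. FakeResult is a hypothetical stand-in for a fitted model, and make_info_dict and extra_rows are illustrative names, not part of this PR; the numbers are taken from the table posted above.

```python
from collections import namedtuple

# Hypothetical stand-in for a fitted results object.
FakeResult = namedtuple("FakeResult", ["nobs", "rsquared", "aic"])


def make_info_dict():
    """Map row labels to functions that format one value per model."""
    return {
        "N": lambda res: "%d" % res.nobs,
        "R2": lambda res: "%.3f" % res.rsquared,
        "AIC": lambda res: "%.3f" % res.aic,
    }


def extra_rows(result, info_dict):
    """Apply every function in info_dict to a single result."""
    return {label: func(result) for label, func in info_dict.items()}


res = FakeResult(nobs=36308, rsquared=0.053, aic=116081.787)
info_rows = extra_rows(res, make_info_dict())
print(info_rows)
```

A t-test row could then be added as one more entry in the dict, without touching the table-building code.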


Yes, either adding new rows like the N, R2, or adding to the parameter table. (I haven't looked at the details yet.)

This can also wait, just something from the wishlist to keep in mind.

@vincentarelbundock vincentarelbundock referenced this pull request Jan 30, 2013

Summary re-write #636

2 of 6 tasks complete

Work moved to #636


Coverage Status

Changes Unknown when pulling 2f9000d on vincentarelbundock:summary2 into * on statsmodels:master*.
