enh: fractional Logit, Probit #2040

Closed
josef-pkt opened this Issue Oct 10, 2014 · 4 comments

@josef-pkt
Member

http://onlinelibrary.wiley.com/doi/10.1002/%28SICI%291099-1255%28199611%2911:6%3C619::AID-JAE418%3E3.0.CO;2-1/abstract

We need to replace the check for the set {0, 1} with a check for the interval [0, 1].

From a partial skim of the paper only:

It uses quasi-MLE with Logit/Bernoulli and a sandwich covariance for robust inference.
This means Logit with a White-type robust covariance, e.g. cov_type="HC0", should already do everything.

They have a 2008 paper that uses Probit in a panel setting:
http://www.sciencedirect.com/science/article/pii/S030440760800050X

The first article has 1417 citations in Google, so there is a lot more similar work.

@josef-pkt josef-pkt added this to the 0.6 milestone Oct 10, 2014
@josef-pkt
Member
"
Interestingly, the robust standard errors from equation (9) in the context of ordinary logit
and probit are computed almost routinely by certain statistics and econometrics packages,
such as STATA® and SST®. Unfortunately, the packages with which we are familiar
automatically transform the dependent variable used in logit or probit into a binary variable
before estimation, or do not allow non-binary variables at all (STATA® and SST® fall into
the first category). With the minor change of allowing for fractional y in so-called binary
response analysis, standard software packages could be used to estimate the parameters in
equation (4) and to perform asymptotically valid inference. Alternatively, programming the
estimator in a language such as GAUSS®, as we do for our application in Section 4, is fairly
straightforward.
"

Papke and Wooldridge 1996

GLM in Stata has been updated since then
http://www.stata.com/support/faqs/statistics/logit-transformation/

@josef-pkt
Member

@kshedden Does GEE work with fractional data in a logit or probit binary/binomial model?

I haven't looked at the example yet; it has panel data for a response variable that is a proportion.

In the Papke and Wooldridge 2008 article they use just GLM and GEE (among others),

for example

glm math4 lavgrexp alavgrexp lunch alunch lenroll alenroll y96-y01 if year>1994, fa(bin) link(probit) cluster(distid)
mat b = e(b)
xtgee math4 lavgrexp lunch lenroll alavgrexp alunch alenroll y96-y01, fa(bi) link(probit) corr(exch) robust from(b,skip)

Papke has the replication files on her (departmental) web page (her personal page gives me a permission error) http://www.econ.msu.edu/faculty/papke/

@kshedden
Contributor

It seems to work fine, I made a notebook:

http://nbviewer.ipython.org/urls/umich.box.com/shared/static/y0azjuau3t21b7p11m56.ipynb

@jseabold
Member

PR?

@jseabold jseabold added a commit to jseabold/statsmodels that referenced this issue Oct 15, 2014
@jseabold jseabold ENH: Allow unit interval for binary models. Closes #2040. 3a6fd4e
@jseabold jseabold closed this in #2044 Oct 15, 2014
@yarikoptic yarikoptic added a commit to yarikoptic/statsmodels that referenced this issue Oct 23, 2014
@yarikoptic yarikoptic Merge tag 'v0.6.0rc1' into debian
Version 0.6.0 Release Candidate 1

* tag 'v0.6.0rc1': (58 commits)
  RLS: Set version number to 0.6.0rc1
  DOC: Fix docstrings.
  DOC: Add release notes to index
  DOC: Fix bullet list.
  DOC: Fix directive.
  DOC: Cleanup docstrings.
  DOC: Use formula namespace
  DOC: Use get_rdataset.
  DOC: Make more pythonic
  DOC: Rename file
  DOC: Just use github for roadmap
  MAINT: Update patsy min version
  DOC: Updates to release notes.
  DOC: Use idiomatic statsmodels.
  MAINT: Update release notes with gh-stats
  MAINT: Deal with unicode.
  MAINT: Workaround for existing OAuth not in keyring.
  SQUASHME: Update release notes.
  MAINT: Update mailmap.
  ENH: Allow unit interval for binary models. Closes #2040.
  ...
21f6bc3
@yarikoptic yarikoptic added a commit to yarikoptic/statsmodels that referenced this issue Nov 17, 2014
@yarikoptic yarikoptic Merge tag 'v0.6.0rc1' into releases
Version 0.6.0 Release Candidate 1

* tag 'v0.6.0rc1': (295 commits)
  RLS: Set version number to 0.6.0rc1
  DOC: Fix docstrings.
  DOC: Add release notes to index
  DOC: Fix bullet list.
  DOC: Fix directive.
  DOC: Cleanup docstrings.
  DOC: Use formula namespace
  DOC: Use get_rdataset.
  DOC: Make more pythonic
  DOC: Rename file
  DOC: Just use github for roadmap
  MAINT: Update patsy min version
  DOC: Updates to release notes.
  DOC: Use idiomatic statsmodels.
  MAINT: Update release notes with gh-stats
  MAINT: Deal with unicode.
  MAINT: Workaround for existing OAuth not in keyring.
  SQUASHME: Update release notes.
  MAINT: Update mailmap.
  ENH: Allow unit interval for binary models. Closes #2040.
  ...
90cbfc3