
ENH: update stata for Stata 13 format #4291

Closed
jreback opened this issue Jul 18, 2013 · 6 comments · Fixed by #4662

Comments

@jreback
Contributor

commented Jul 18, 2013

@PKEuS

Contributor

commented Jul 31, 2013

The new Stata format is clearly inspired by XML, although it doesn't fully behave like XML: the "marker pairs" (tags, in XML terminology) have to appear in a fixed order, so they are basically an aid for reading through raw Stata files. Thus, binary reading, just as for the old formats, should still work. An XML reader, however, might make reading those files easier and more flexible (in case they someday really implement it as XML).
Do you prefer an implementation based on the XML approach (for example, using Python's xml.dom.minidom module) or the binary approach?
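
To make the binary option concrete, here is a minimal sketch of reading marker pairs in a fixed order. The byte layout in `sample` is a simplified, synthetic stand-in and not the actual Stata 13 specification; the helper names `expect_marker` and `read_tagged` are hypothetical.

```python
import io
import struct


def expect_marker(stream, marker):
    """Consume `marker` from the stream, or raise if other bytes are found."""
    found = stream.read(len(marker))
    if found != marker:
        raise ValueError(f"expected {marker!r}, found {found!r}")


def read_tagged(stream, name, size):
    """Read `size` payload bytes wrapped in <name>...</name>."""
    expect_marker(stream, b"<" + name + b">")
    payload = stream.read(size)
    expect_marker(stream, b"</" + name + b">")
    return payload


# Synthetic layout (an assumption for illustration): an ASCII release
# number followed by a little-endian 2-byte variable count.
sample = b"<release>117</release><K>" + struct.pack("<H", 3) + b"</K>"

stream = io.BytesIO(sample)
# Because the markers come in a fixed order, the reader simply asks for
# each one in turn; no general XML parsing is involved.
release = int(read_tagged(stream, b"release", 3).decode("ascii"))
n_vars = struct.unpack("<H", read_tagged(stream, b"K", 2))[0]
print(release, n_vars)
```

With this approach a marker that shows up out of order raises immediately, which is the main practical difference from handing the file to a generic XML parser.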

@jreback

Contributor Author

commented Jul 31, 2013

lxml is on pandas' list of optional deps, so you can use that. But isn't the binary format similar to the existing one?

So that would be easier?

@PKEuS

Contributor

commented Jul 31, 2013

Inside the tags, most stuff should be the same as before. Since there is a map with seek positions for all content tags, the parser could rely on that. I think I'll try the binary approach.
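
A rough sketch of how a parser could lean on such a map: read the table of offsets once, then seek straight to the section of interest. The toy file below is built in memory; its section names, the 8-byte little-endian offsets, and the helpers `build_toy_file`, `read_map`, and `read_section` are all assumptions for illustration, not the real layout.

```python
import io
import struct

SECTIONS = (b"names", b"data")  # hypothetical section names for the toy file


def build_toy_file():
    """Build a tiny in-memory file: a <map> of offsets, then the sections."""
    map_size = len(b"<map>") + 8 * len(SECTIONS) + len(b"</map>")
    names = b"<names>ab</names>"
    data = b"<data>\x01\x02\x03</data>"
    # Each offset is the absolute position of a section's opening marker.
    offsets = (map_size, map_size + len(names))
    table = struct.pack("<%dQ" % len(SECTIONS), *offsets)
    return b"<map>" + table + b"</map>" + names + data


def read_map(stream):
    """Return {section name: absolute offset} from the map at file start."""
    stream.seek(0)
    if stream.read(5) != b"<map>":
        raise ValueError("file does not start with <map>")
    raw = stream.read(8 * len(SECTIONS))
    if stream.read(6) != b"</map>":
        raise ValueError("missing </map>")
    return dict(zip(SECTIONS, struct.unpack("<%dQ" % len(SECTIONS), raw)))


def read_section(stream, offsets, name, size):
    """Seek to a section via the map and return `size` payload bytes."""
    stream.seek(offsets[name])
    opening = b"<" + name + b">"
    if stream.read(len(opening)) != opening:
        raise ValueError("map offset does not point at " + repr(opening))
    return stream.read(size)


stream = io.BytesIO(build_toy_file())
offsets = read_map(stream)
payload = read_section(stream, offsets, b"data", 3)
print(offsets, list(payload))
```

Checking the opening marker after each seek keeps the map honest: a stale or corrupt offset fails loudly instead of yielding garbage.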

@jreback

Contributor Author

commented Jul 31, 2013

Great! Leave all of the existing data files alone and add a new one for version 13, to ensure back compat. Thanks!

@TomAugspurger

Contributor

commented Jul 31, 2013

The statsmodels repo has an issue about this too. I don't know if @jseabold or anyone else has done anything with it.

@jseabold

Contributor

commented Jul 31, 2013

I haven't yet. If I do get to it (unlikely right now), I'll make a PR against pandas.
