Wrong core sample selection? #8

Closed
nilshg opened this issue May 13, 2015 · 6 comments
@nilshg
Contributor

nilshg commented May 13, 2015

Either I'm misunderstanding the PSID documentation, or there's a mistake in how the core sample is selected. Check out line 341:

yind <- copy(yind[ER30001>2930])    # individuals 1-2930 are from poor sample

This drops all individuals with a 1968 family interview number of 1-2930. But, from the PSID FAQs:

How can I identify the SEO (Survey of Economic Opportunity) sample and the SRC (Survey Research Center) sample?

You will need to look at the 1968 family interview number available in the individual-level files (V30001 and ER30001 in 2007).
SRC sample families have values less than 3000.
SEO sample families have values greater than 5000 and less than 7000.

This is also consistent with a dataset from a paper I'm currently replicating, which uses families 1-2930 as the core.

So shouldn't the line read:

yind <- copy(yind[ER30001<=2930])   # individuals 1-2930 are from the SRC (core) sample
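
To make the difference between the two filters concrete, here is a minimal sketch on made-up data. It uses a base R data frame rather than the package's data.table code, and the interview numbers and the `person` column are invented for illustration. Only the column name `ER30001` and the cut-offs come from the code and the FAQ quoted above.

```r
# Toy individual-level data; ER30001 is the 1968 family interview number.
# All values here are invented for illustration.
yind <- data.frame(
  ER30001 = c(1, 2930, 2931, 5001, 6872),
  person  = c("a", "b", "c", "d", "e"),
  stringsAsFactors = FALSE
)

# Filter as currently written: keeps only interview numbers above 2930.
kept_old <- yind[yind$ER30001 > 2930, ]

# Proposed filter: keeps interview numbers 1-2930.
kept_new <- yind[yind$ER30001 <= 2930, ]

# Rule as stated in the FAQ: SRC < 3000, SEO > 5000 and < 7000.
is_src <- yind$ER30001 < 3000
is_seo <- yind$ER30001 > 5000 & yind$ER30001 < 7000

print(kept_old$person)
print(kept_new$person)
print(yind$person[is_src])
print(yind$person[is_seo])
```

The toy value 2931 is only there to show where the `<= 2930` cut-off and the FAQ's `< 3000` rule would disagree; whether such interview numbers actually occur in the PSID data is not something this sketch establishes.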
@floswald
Owner

Looks like a bug. Is that so in all years? Can you send the link to that FAQ, please? Thanks.

@nilshg
Contributor Author

nilshg commented May 13, 2015

Sorry, copy-paste failed above, link to FAQs now fixed.
Yes, I believe this is the case in all years - this line is inside the year-loop, so you're selecting the SEO rather than the core sample every year.

@floswald floswald added the bug label May 13, 2015
@floswald
Owner

OK, I fixed that in 0ed96f7. I never used this functionality (I always used both samples), but I really should have tested it better. I do apologize. I'll also add the other selection criteria (Latino, etc.) later on.

@floswald
Owner

I changed the core argument to sample in a629696; it now takes a string. Testing is greatly improved. Thanks again for looking at the code!
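
As a rough illustration of what a string-valued sample selector could look like, here is a hedged sketch. The function name `select_sample`, the labels `"SRC"` and `"SEO"`, and the base R data frame are all assumptions made for this example; this is not the code from a629696. The ranges follow the FAQ quoted earlier in the thread.

```r
# Hypothetical helper: pick a PSID sample by name using ER30001,
# the 1968 family interview number. Not the package's actual implementation.
select_sample <- function(yind, sample = NULL) {
  # No selection requested: keep both samples.
  if (is.null(sample)) return(yind)

  # Reject anything other than the two labels assumed here.
  sample <- match.arg(sample, c("SRC", "SEO"))

  id <- yind$ER30001
  keep <- switch(sample,
    SRC = id < 3000,             # FAQ: SRC families have values below 3000
    SEO = id > 5000 & id < 7000  # FAQ: SEO families are between 5000 and 7000
  )
  yind[keep, , drop = FALSE]
}

# Usage on invented interview numbers.
toy <- data.frame(ER30001 = c(10, 2999, 5500, 7001))
src_rows <- select_sample(toy, "SRC")
seo_rows <- select_sample(toy, "SEO")
all_rows <- select_sample(toy)
```

Taking a string instead of a logical flag leaves room for further samples (such as the Latino sample mentioned above) without changing the function signature.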

@nilshg
Contributor Author

nilshg commented May 14, 2015

That's great, thanks! Are these changes pushed to CRAN as well? I'm having trouble installing RTools on my work machine without admin privileges, so it would be great to be able to download the fixed version from there.

@floswald
Owner

The Windows binary is built now, so it's good to go.

On Friday, 15 May 2015, Florian Oswald florian.oswald@gmail.com wrote:

It's on CRAN now, but the binaries are not built yet, it seems. Check that it's version 1.3 after installing!

Cheers
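
One way to run that version check from an R session is sketched below. The helper name `has_min_version` is made up for this example; `requireNamespace` and `packageVersion` are standard R functions.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `min_version` or newer,
# FALSE if it is missing or older.
has_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(FALSE)
  utils::packageVersion(pkg) >= min_version
}

# After install.packages("psidR"), this should be TRUE once 1.3 is picked up.
has_min_version("psidR", "1.3")
```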
