Is there a reference mapping the sample name code to the sample dataset name? #53

Closed
hessakh opened this issue Sep 26, 2023 · 6 comments

@hessakh

hessakh commented Sep 26, 2023

Is there a reference mapping the sample name code to the sample dataset name? I don't see it in the vignette. Thank you!

@robe2037
Collaborator

Hi @hessakh, if you're referring to a mapping between the sample codes used by the IPUMS API and the more human-readable names of those samples, you can use get_sample_info(). Sample codes and names are available for the currently supported microdata collections (“usa”, “cps”, and “ipumsi”). See the Microdata API Requests vignette for an example.

The IPUMS API doesn’t yet support more extensive metadata access for microdata collections, but this functionality is slated to be added in the future.

@hessakh
Author

hessakh commented Sep 26, 2023

Thank you! Do you by any chance know which collection the 2021 ACS 5-year sample falls under?

@robe2037
Collaborator

For 2021 ACS 5-year microdata, you'll find the sample in IPUMS USA. Take a look at the IPUMS USA website for more details about the data it provides.

If you need additional help identifying data for your needs, check out the IPUMS Forum.

@hessakh
Author

hessakh commented Sep 26, 2023 via email

@robe2037
Collaborator

robe2037 commented Sep 26, 2023

As mentioned above, you can view a listing of all sample codes for IPUMS USA by using get_sample_info(). For instance:

samples <- get_sample_info("usa")

samples
#> # A tibble: 150 × 2
#>    name    description
#>    <chr>   <chr>
#>  1 us1850a 1850 1%
#>  2 us1850b 1850 100% sample (July 2015)
#>  3 us1850c 1850 100% sample (Revised December 2017)
#>  4 us1860a 1860 1%
#>  5 us1860b 1860 1% sample with black oversample
#>  6 us1860c 1860 100% sample (Jan 2019)
#>  7 us1870a 1870 1%
#>  8 us1870b 1870 1% sample with black oversample
#>  9 us1870c 1870 100% sample (Jan 2019)
#> 10 us1880a 1880 1%
#> # ℹ 140 more rows
#> # ℹ Use `print(n = ...)` to see more rows

The name column contains the code that you would use in define_extract_usa(). The description column gives a short description of the sample associated with the indicated code.

If you're looking for 2021 ACS data, look for a description that includes those keywords. You can do this manually or by filtering through the table:

samples[grepl("2021 ACS", samples$description), ]
#> # A tibble: 1 × 2
#>   name    description
#>   <chr>   <chr>
#> 1 us2021a 2021 ACS

So, you would use "us2021a" in the samples field of define_extract_usa().
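As a rough sketch of that step, assuming the installed ipumsr version provides define_extract_usa() as named above: the description text and the variables AGE and SEX below are placeholders chosen for illustration, not something taken from this thread.

```r
library(ipumsr)

# Hypothetical extract definition. The description and the variables
# AGE and SEX are illustrative placeholders; swap in your own.
extract <- define_extract_usa(
  description = "2021 ACS example extract",
  samples = "us2021a",
  variables = c("AGE", "SEX")
)

# Defining the extract does not contact the API. Submitting it does,
# and requires an API key and IPUMS USA registration:
# submitted <- submit_extract(extract)
```

The submit call is left commented out so that the sketch can be run without credentials.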

Note that this assumes you have set up your API key and are registered for IPUMS USA. For more about setting up your API key, see the corresponding section in the Introduction to the API.

@hessakh
Author

hessakh commented Sep 26, 2023

OK, I see. I used print(samples[grepl("ACS 5-year", samples$description), ]) to find the 2021 ACS 5-year sample, which is different from the 2021 ACS sample, and got the following:

# A tibble: 13 × 2
   name    description
   <chr>   <chr>
 1 us2009e 2005-2009, ACS 5-year
 2 us2010e 2006-2010, ACS 5-year
 3 us2011e 2007-2011, ACS 5-year
 4 us2012e 2008-2012, ACS 5-year
 5 us2013e 2009-2013, ACS 5-year
 6 us2014c 2010-2014, ACS 5-year
 7 us2015c 2011-2015, ACS 5-year
 8 us2016c 2012-2016, ACS 5-year
 9 us2017c 2013-2017, ACS 5-year
10 us2018c 2014-2018, ACS 5-year
11 us2019c 2015-2019, ACS 5-year
12 us2020c 2016-2020, ACS 5-year
13 us2021c 2017-2021, ACS 5-year

So the sample code I was looking for is 'us2021c', for the 2017-2021 ACS 5-year sample.
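For anyone landing here later, the difference between the two lookups can be sketched without an API key. This is only an illustration: the table is a hard-coded copy of three rows from the outputs in this thread, and find_sample_code() is a hypothetical helper, not an ipumsr function.

```r
# Hard-coded copy of three rows shown in this thread, so the sketch
# runs offline. In practice the table would come from
# get_sample_info("usa").
samples <- data.frame(
  name = c("us2021a", "us2020c", "us2021c"),
  description = c(
    "2021 ACS",
    "2016-2020, ACS 5-year",
    "2017-2021, ACS 5-year"
  ),
  stringsAsFactors = FALSE
)

# find_sample_code() is a hypothetical helper, not part of ipumsr.
# It returns the codes whose description contains `pattern` literally
# (fixed = TRUE turns off regular-expression matching).
find_sample_code <- function(samples, pattern) {
  samples$name[grepl(pattern, samples$description, fixed = TRUE)]
}

find_sample_code(samples, "2021, ACS 5-year")
#> [1] "us2021c"

find_sample_code(samples, "2021 ACS")
#> [1] "us2021a"
```

The descriptions differ in punctuation ("2021 ACS" versus "2017-2021, ACS 5-year"), which is why a search for "2021 ACS" finds only the single-year sample.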

@hessakh hessakh closed this as completed Sep 26, 2023