Add MSA FIPS code to geohi (csv naming schema), remove label_geography_cbsa.csv #66

Closed
srt1 opened this Issue Sep 1, 2017 · 19 comments

3 participants
@srt1
Collaborator

srt1 commented Sep 1, 2017

In section 6.2, the "geohi" can be the 5-digit FIPS code for the MSA.

@srt1 srt1 added this to the V4.2b-draft milestone Sep 1, 2017

heathhayward Sep 1, 2017

Collaborator

The 5-digit FIPS codes should be added to https://lehd.ces.census.gov/data/schema/V4.2b-draft/label_fipsnum.csv. Is the logic that the metro-area files will have the 5-digit codes in their names? https://lehd.ces.census.gov/data/schema/V4.2b-draft/naming_geohi.csv is described as containing alphabetic FIPS codes, not numeric ones, so the format or description of this file would need to change.

srt1 Sep 1, 2017

Collaborator

There is one file per metro area, per data product (J2J, J2JR, J2JOD). These are collected in the /metro directory. Where the file name otherwise has "us" or a state postal code, it will contain the 5-digit FIPS code. It will also have "sarhe", which is described elsewhere in the schema.

I defer to others on how to describe this, and at what point to implement it (4.2b or 4.2c). It's in the files I created for the release.
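The naming rule described above could be checked with something like the following. This is only a sketch: it assumes the geohi component is lowercase, and that it is either "us", a two-letter state postal code, or a 5-digit code. The function name `classify_geohi` and the category names are hypothetical, not part of the schema.

```python
import re


def classify_geohi(token: str) -> str:
    """Classify a geohi filename component (hypothetical helper).

    Assumes lowercase tokens: "us", a two-letter state postal code,
    or a 5-digit metro-area code.
    """
    if token == "us":
        return "national"
    if re.fullmatch(r"[a-z]{2}", token):
        # Only the shape is checked, not whether the postal code exists.
        return "state"
    if re.fullmatch(r"[0-9]{5}", token):
        return "metro"
    raise ValueError(f"unrecognized geohi value: {token!r}")
```

A stricter version would look the token up in naming_geohi.csv instead of relying on its shape.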

larsvilhuber Sep 1, 2017

Member

larsvilhuber Sep 7, 2017

Member

What is the difference between label_geography_metro.csv and label_geography_cbsa.csv?

  • both are labeled as geo_level=B records.
  • The former only has metropolitan areas, the latter has both micro and metropolitan areas.
  • The former has state-balance areas listed ([ST]999), the latter does not
  • The latter has the title "Metropolitan Statistical Area" as part of the label, the former does not

Please let me know which file is useful/being used. It looks like label_geography_cbsa.csv is a straight dump from geography, whereas label_geography_metro.csv is what is de facto used by J2J and the file naming convention.

  • Consider adding the "Metropolitan Statistical Area" part to label_geography_metro.csv.
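One way to check the differences listed above is to compare the code sets of the two files directly. A minimal sketch, assuming both files have a header row with a `geography` column (an assumption, not confirmed in this thread); the function names are hypothetical:

```python
import csv
import io


def geography_codes(csv_text: str, column: str = "geography") -> set:
    """Collect the codes found in one label file, passed in as text.

    The column name "geography" is an assumption about the file layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}


def compare_label_files(metro_text: str, cbsa_text: str):
    """Return (codes only in the metro file, codes only in the cbsa file)."""
    metro = geography_codes(metro_text)
    cbsa = geography_codes(cbsa_text)
    return sorted(metro - cbsa), sorted(cbsa - metro)
```

If the description above holds, the first list would contain the state-balance codes ([ST]999) and the second the micropolitan areas.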

larsvilhuber Sep 7, 2017

Member

P.S. label_geography_cbsa.csv appeared for the first time in V4.1d-draft

larsvilhuber added a commit that referenced this issue Sep 7, 2017

larsvilhuber Sep 7, 2017

Member

OK, difference between _metro and _cbsa:

  • CBSA is used in Shapefiles
  • Metro is used in J2J and (modified) in QWI

Reference to metro is in public_use_schema, reference to cbsa is in shapefiles.

Turning to geohi now...

larsvilhuber added a commit that referenced this issue Sep 7, 2017

larsvilhuber added a commit that referenced this issue Sep 7, 2017

@larsvilhuber larsvilhuber removed their assignment Sep 7, 2017

heathhayward Sep 7, 2017

Collaborator

I think we added the "_cbsa" file to the schema for J2J before we knew that J2J was going to be metro only. I vote to remove that file since the two iterations used in our data products are covered by the "metro" file ("B" for J2J) and the "[ST]" files ("M" for QWI). The "_cbsa" file is confusing and doesn't reflect anything in our data (that I know of). Lars, are you OK with dropping it?

larsvilhuber Sep 7, 2017

Member

heathhayward Sep 7, 2017

Collaborator

If we edit 5.2.2 and 5.2.3 to reference label_geography_metro.csv instead of label_geography_cbsa.csv then we can get rid of the _cbsa file from the schema. shazzaaaaamm

heathhayward Sep 7, 2017

Collaborator

We don't use it in the shapefiles.

larsvilhuber Sep 7, 2017

Member

heathhayward Sep 7, 2017

Collaborator

Can we not? This would mean that we would have labels like: "Not in metropolitan area, AL, Metropolitan Statistical Area".

larsvilhuber Sep 7, 2017

Member

srt1 Sep 7, 2017

Collaborator

I didn't even realize that the naming_geohi.csv file was even a thing - I just thought it was a short description, forgetting that there was an underlying file containing all of the possibilities. So we seem to be beyond my familiarity with this part of the schema. The ticket's goals were accomplished as far as I had originally requested, and beyond. So I'm happy when you guys are happy.

heathhayward Sep 7, 2017

Collaborator

I vote for not adding them. Let's remove the label_geography_cbsa.csv file and call it a day

heathhayward Sep 7, 2017

Collaborator

The geo_level B defines the MSA label, similar to how the geo_level in the label_geography.csv file defines the other geographies (i.e., we don't have "County" or "State" included in the labels for other geographies). So I think we are being consistent.

larsvilhuber added a commit that referenced this issue Sep 7, 2017

larsvilhuber added a commit that referenced this issue Sep 7, 2017

larsvilhuber Dec 15, 2017

Member

Ready to implement in 4.2

@larsvilhuber larsvilhuber reopened this Dec 15, 2017

@larsvilhuber larsvilhuber changed the title from Add MSA FIPS code to geohi (csv naming schema) to Add MSA FIPS code to geohi (csv naming schema), remove label_geography_cbsa.csv Dec 18, 2017

larsvilhuber Dec 18, 2017

Member

@heathhayward @srt1 :
This had two components: naming of files (geohi) and presence of geography_cbsa (to be removed). We never implemented the first component.

Questions:

  • what is the universe of CBSA codes in 'geohi'? Are the 01999... allowed, or only the 10180... kind of codes?
  • this affects naming_geohi.csv

heathhayward Dec 18, 2017

Collaborator

The universe of MSA codes in 'geohi' does and should include the '01999' state remainders. This matches with the filenames Stephen creates in https://lehd.ces.census.gov/data/j2j/R2017Q3/j2j/metro/, for example. So the naming_geohi.csv file that we've got on 4.2b-draft looks correct to me (https://lehd.ces.census.gov/data/schema/V4.2b-draft/naming_geohi.csv).
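To make the two kinds of codes concrete, a rough sketch of how they could be separated. It assumes a state remainder is always two digits followed by "999" (the "[ST]999" pattern, with ST read as the two-digit state FIPS code) and that no regular area code ends in "999"; both are assumptions. The helper names are hypothetical.

```python
import re

# Assumed shape of a state-remainder code: two digits, then "999".
_STATE_REMAINDER = re.compile(r"[0-9]{2}999")


def is_state_remainder(code: str) -> bool:
    """True if the code looks like a state remainder such as '01999'."""
    return _STATE_REMAINDER.fullmatch(code) is not None


def split_universe(codes):
    """Split geohi metro codes into (regular area codes, state remainders)."""
    areas = [c for c in codes if not is_state_remainder(c)]
    remainders = [c for c in codes if is_state_remainder(c)]
    return areas, remainders
```

With both kinds allowed in 'geohi', either list can appear in file names under /metro.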

Bigger picture, the geography schema page is correct and complete from my perspective. The only thing we might want to consider changing is to include all of the 'label_geography_metro' rows in the 'label_geography.csv' file. In https://lehd.ces.census.gov/data/schema/V4.2b-draft/lehd_public_use_schema.html#_a_id_geography_a_geography, we do mention that this is "a composite file containing all geocodes", so shouldn't we include the "B" geo_level geographies here?
