Reconcile ownercode/agg_level, add agg_level to QWI csv files #70

Open
srt1 opened this Issue Sep 12, 2017 · 10 comments

3 participants
@srt1
Collaborator

srt1 commented Sep 12, 2017

We would like to add agg_level to the QWI files, but we need to resolve some complications related to the ownercode variable.

One way we could approach it without adding any additional variables might be the following (and we've spoken around this concept in the past, so just revisiting it a bit here):

  • J2J (counts) tabs are NOT by ownership, so state totals are agg_level=1025 (internal, margin=2048), ownership=A00

  • QWI state totals are NOT by ownership, so same

  • QWI private only ARE by ownership, so state totals are agg_level=1057 (internal, margin=2080), ownership=A05. Implicitly, there is a residual for (A00 minus A05).

  • A standalone federal tabulation could be considered either by ownership or not - so the state totals would be either 1025 or 1057, depending on how you want to characterize it. The ownership variable should be B01, to indicate the universe is the set of OPM jobs.

  • A consolidated UI+OPM QWI may be as follows:

    • Grand total for state, agg_level=1025, ownership=C00
    • Federal total: agg_level=1057, ownership=C01
    • Private total: agg_level=1057, ownership=C05
    • State/local - can be explicitly reported or omitted, depending on how we want to do it.

This would give us a consistent framework for how to report the universe and the crossings. I think in the OPM beta tabulations we released we used A01, which is problematic, but that was just a beta release. It doesn't hurt us to adopt this paradigm. If we want to go with this, it would be pretty trivial to add the agg_level variable to the QWI csv files, and it is already in the schema. We would also have to remove the "A01 Federal" from the schema, and change the current ownercode label to be "All state, local, private", or some such.
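To illustrate how small the CSV change would be, here is a minimal sketch that appends agg_level to state-total rows based on ownercode. It only covers the two state-total cases quoted above (A00 → 1025, A05 → 1057); the column names `geography` and `Emp` and the sample values are assumptions for illustration, not the actual QWI file layout or production code.

```python
import csv
import io

# agg_level values quoted above for state totals with no other crossings.
STATE_TOTAL_AGG_LEVEL = {
    "A00": 1025,  # not by ownership
    "A05": 1057,  # by ownership (private only)
}


def add_agg_level(csv_text):
    """Return csv_text with an agg_level column appended to every row.

    Assumes every row is a state total and has an 'ownercode' column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    fieldnames = list(reader.fieldnames) + ["agg_level"]
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # A KeyError here means an ownercode outside this sketch's scope.
        row["agg_level"] = STATE_TOTAL_AGG_LEVEL[row["ownercode"]]
        writer.writerow(row)
    return out.getvalue()


# Hypothetical two-row extract; the counts are made up.
sample = "geography,ownercode,Emp\n24,A00,100\n24,A05,80\n"
print(add_agg_level(sample))
```

A real implementation would need the full agg_level table from the schema, since the value also depends on the other characteristics crossed on the record.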

Alternatively, we could use this approach:

  • Redefine A00 to be state+local+private (which it is, we just don't say it)
  • A01 can still be federal, state totals should be agg_level=1057 (this is quirky, but there is nothing to say that A01+A05+[state/local residual]=A00)
  • A05 can still be private, still state totals are agg_level=1057
  • Use B00 for fed+state+local+private, state totals agg_level=1025
    • To me, I would prefer that Federal workers would be B01, so that B01+A05+[state/local residual]=B00

The distinction of the agg_level matters if we ever were to cross J2J by ownership. We would also need to have an explicit residual between the total (A00) and any parts (A05), let's call it (A0R) for now. And then if we added Federal workers into the J2J universe, we would have to use an alternate code for the total (B00), and then the components (A01, A05, B0R). The appropriate agg_level values would be used, depending on whether it is a total across all ownership categories, or if it is by detailed ownership categories, however those detailed categories are defined.
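As a sketch of what an explicit residual row could look like, the function below derives the placeholder A0R record as total (A00) minus private (A05) and gives it the same detailed-ownership agg_level as the A05 row. The record layout (`ownercode`, `agg_level`, `Emp`) and the counts are assumptions for illustration only.

```python
def residual_row(total, private):
    """Build the placeholder A0R row as total (A00) minus private (A05).

    Both arguments are dicts with 'ownercode', 'agg_level' and 'Emp' keys.
    The residual is a detailed ownership category, so it takes the
    agg_level of the A05 row rather than that of the A00 total.
    """
    if total["ownercode"] != "A00" or private["ownercode"] != "A05":
        raise ValueError("expected an A00 total row and an A05 detail row")
    return {
        "ownercode": "A0R",
        "agg_level": private["agg_level"],
        "Emp": total["Emp"] - private["Emp"],
    }


# Hypothetical state-total rows; the counts are made up.
a00 = {"ownercode": "A00", "agg_level": 1025, "Emp": 100}
a05 = {"ownercode": "A05", "agg_level": 1057, "Emp": 80}
a0r = residual_row(a00, a05)
print(a0r)
```

The same pattern would apply to B0R against a B00 total once more than one explicit component is subtracted.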

Whatever we do, I think that somewhere in the schema we need to describe that the A00 code does NOT contain federal workers. We don't do that now, and I'm not sure how users are supposed to figure it out.

srt1 added this to the V4.3.0 milestone Sep 12, 2017

Member

larsvilhuber commented Oct 30, 2017

For one, we could

  • relabel A00 to be clear about not including federal ownership
  • define a new Bxx category
  • announce a switch from Axx to Bxx for all downstream users

I suggest not defining residuals, but instead correctly describing the universe. It would be great if agg_level could be incorporated into it (different owners with the same agg_levels seems confusing).

Collaborator

srt1 commented Oct 30, 2017

Member

larsvilhuber commented Dec 15, 2017

@srt1 : any progress on thinking this out?

Just a note: "All own" -> state isn't actually a transition that makes sense to me.

Collaborator

srt1 commented Dec 15, 2017

Collaborator

srt1 commented Dec 18, 2017

Let me restate my thoughts on how I see the agg_level and ownership fields interacting, and assorted tasks for implementation:

  • The OWNERSHIP code is used to define the universe in scope for the tabulation
  • The AGG_LEVEL code is used to identify which characteristics are reported in detail on a particular record (i.e., the record is not a subtotal for that characteristic).
  • In QWI, A00 is the universe, and should be redefined to be "State, Local, Private Ownership". This corresponds to agg_levels for subtotals across all ownership categories.
  • In QWI, A05 is a detail aggregation, and corresponds to agg_levels for detailed ownership categories.
    • We could conceptually provide a residual for the difference between A00 and A05, at the same agg_level as A05.
  • In J2J, all published aggregations so far correspond to agg_levels for A00 on QWI. If we were to interact with detailed ownership categories (public/private), the agg_levels would correspond to those for A05.
    • If we were to do this, we could opt to use an alternate ownership classification, perhaps:
      • B00: All ownership, state+local+private (subtotal)
      • B01: Public sector, state+local
      • B05: Private sector
    • Using alternate ownership categories does not change the agg_levels
    • Origin-Destination tabulations can be a subtotal (all in-scope ownership categories for table) on one side and detailed ownership categories on the other. This would be indicated in the agg_level table.
  • The ownership for OPM should be of a separate scheme, perhaps C01. You could consider this either the same aggregation level as A00 or that for A05 - it's fairly immaterial in some ways, since there is no distinction between subtotals and detail aggregations; though I might argue it should be considered a "detail" row (a la A05 in QWI).
    • If we ever rolled up UI + federal data, it would be a separate scheme; perhaps:
      • D00: All ownership, federal+state+local+private (subtotal)
      • D01: Federal
      • D02: State+local
      • D05: Private
    • All records with D01-D05 would have the same agg_level as the QWI A05 records.

I think this framework is a consistent way of presenting both concepts. I agree that mixing A00 and B01 within a tabulation would be really funky, and we shouldn't go that route. But otherwise, I think this is a reasonable path forward. For public facing tasks, for 4.2 we should redefine what A00 means (QWI and J2J); and perhaps in 4.3 we will add agg_level to QWIPU csv files (for 2018Q3 production?).
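A minimal sketch of this framework, assuming the first letter of the ownership code names the scheme (universe) and a code ending in "00" is the subtotal across ownership. The agg_level values 1025 and 1057 are the state-total values quoted earlier in this issue; the B, C and D codes are only the tentative ones listed above, and the helper names are hypothetical.

```python
# Universe implied by each (tentative) ownership scheme letter.
UNIVERSES = {
    "A": "state+local+private (QWI)",
    "B": "state+local+private, public/private detail (J2J)",
    "C": "federal (OPM)",
    "D": "federal+state+local+private (UI + OPM)",
}

# State-total agg_levels quoted earlier in this issue.
SUBTOTAL_AGG_LEVEL = 1025  # subtotal across all in-scope ownership
DETAIL_AGG_LEVEL = 1057    # detailed ownership category


def state_total_agg_level(ownercode):
    """agg_level of a state-total row; C01 is treated as a detail row."""
    if ownercode.endswith("00"):
        return SUBTOTAL_AGG_LEVEL
    return DETAIL_AGG_LEVEL


def check_single_scheme(ownercodes):
    """Return the scheme letter, refusing tabulations that mix schemes."""
    schemes = {code[0] for code in ownercodes}
    if len(schemes) != 1:
        raise ValueError(f"expected exactly one scheme, got {sorted(schemes)}")
    return schemes.pop()


print(state_total_agg_level("D00"), state_total_agg_level("D01"))
print(UNIVERSES[check_single_scheme(["D00", "D01", "D02", "D05"])])
```

Under these assumptions, switching from the A codes to the D codes changes the universe but not the agg_level, and a table mixing A00 with B01 is rejected.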

Member

larsvilhuber commented Dec 18, 2017

Collaborator

srt1 commented Dec 18, 2017

Member

larsvilhuber commented Dec 18, 2017

Collaborator

srt1 commented Dec 18, 2017

Member

larsvilhuber commented Dec 18, 2017

Split out #77

larsvilhuber added this to To do in Schemas Mar 24, 2018
