How to define the cutoff threshold? #17

seb-garcia · 2023-11-17T00:16:22Z

Hello from Peru! I am very interested in the work you have done with the MPI. It has helped us a great deal on a report we are writing to assess the living conditions of Venezuelan refugees and migrants in Peru.

I am trying to apply your package to an NSO survey of the Venezuelan population in Peru (ENPOVE 2022)*. Our repository is at this GitHub repo.

I am having trouble understanding how the cutoff threshold works. First, I tried passing a condition written in dplyr grammar:

deprivation_profile$year_schooling <- df_household_roster |>
  define_deprivation(
    .indicator = year_schooling,
    .cutoff = (P501 < 6 | P501B < 6) & P205_A > 17,
    .collapse = TRUE
  )

And I get this warning:

Warning: There were 1140 warnings in `dplyr::summarise()`.
The first warning was:
ℹ In argument: `Years of schooling = max(`Years of schooling`, na.rm = T)`.
ℹ In group 17: `uuid = "00038002381"`.
Caused by warning in `max()`:
! no non-missing arguments to max; returning -Inf
ℹ Run `dplyr::last_dplyr_warnings()` to see the 1139 remaining warnings.

However, when I do:

deprivation_profile$year_schooling <- df_household_roster |>
  define_deprivation(
    .indicator = year_schooling,
    # .cutoff = (P501 < 6 | P501B < 6) & P205_A > 17,
    .cutoff = year_schooling == 0,
    .collapse = TRUE
  )

The code works as expected.

Could you help us figure out what the problem is?

Also, I assume that once I get the deprivation_matrix as output, I can apply a weighting vector to extrapolate the results. Would that work?

Thank you so much for your help!

*Microdata can be downloaded from INEI's website: one file for households and one for household members.


yng-me · 2023-11-17T21:27:13Z

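Hi, @seb-garcia. On your first question: the condition (P501 < 6 | P501B < 6) & P205_A > 17 seems to return NA for some households. When every row in a household evaluates to NA, max(..., na.rm = T) has no values left to aggregate, hence the warning and the -Inf result. Here's my workaround, but you should decide for yourself how to treat these NAs. In my case, I coerced NA to 0.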

deprivation_profile$year_schooling <- df_household_roster |>
  # Flag deprivation explicitly; rows where the condition is NA are coded 0 (not deprived)
  mutate(deprived_year_schooling = if_else((P501 < 6 | P501B < 6) & P205_A > 17, 1, 0, missing = 0)) |>
  define_deprivation(
    .indicator = year_schooling,
    .cutoff = deprived_year_schooling == 1,
    .collapse = TRUE
  )
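To see where the NAs and the -Inf come from, here is a minimal base R sketch. The data frame `roster` and its values are made up for illustration; only the column names mirror the survey variables, and the per-household `max()` is an assumption based on the warning message above rather than the package's actual internals.

```r
# Hypothetical roster: household "a" has one observed answer, household "b" has none.
roster <- data.frame(
  uuid   = c("a", "a", "b", "b"),
  P501   = c(3, NA, NA, NA),
  P501B  = c(NA, NA, NA, NA),
  P205_A = c(30, 25, 40, 40)
)

# The original cutoff: NA propagates whenever the comparison cannot be decided.
roster$deprived <- with(roster, (P501 < 6 | P501B < 6) & P205_A > 17)

# Collapsing to household level with max(..., na.rm = TRUE):
# a household whose rows are all NA has nothing left to aggregate, so max() returns -Inf.
collapse <- function(x) suppressWarnings(max(x, na.rm = TRUE))
per_household_raw <- vapply(split(roster$deprived, roster$uuid), collapse, numeric(1))

# Workaround: decide how NA should be treated before collapsing (here: not deprived).
roster$deprived_fixed <- ifelse(is.na(roster$deprived), 0, as.numeric(roster$deprived))
per_household_fixed <- vapply(split(roster$deprived_fixed, roster$uuid), collapse, numeric(1))

print(per_household_raw)
print(per_household_fixed)
```

Coding NA as 0 is only one option; treating those households as deprived, or dropping them, are equally mechanical changes to the `ifelse()` line.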

I will add an argument to define_deprivation that controls how NAs resulting from the deprivation cutoff are treated, so you won't need this extra data-transformation step. Watch out for the next release.

yng-me · 2023-11-17T21:41:40Z

On your other question: yes, you can definitely apply a weighting vector to the deprivation_matrix object returned by compute_mpi. Please note, though, that compute_mpi already applies, under the hood, the weights you define in your specification file.
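As a rough illustration of the difference between the two kinds of weights, here is a base R sketch of the standard Alkire-Foster calculation with survey expansion factors layered on top of indicator weights. It assumes the "weighting vector" in the question means sampling weights; the matrix `dep`, the vectors `indicator_w` and `survey_w`, and the cutoff `k` are all hypothetical, and this is not the package's own code.

```r
# Hypothetical 0/1 deprivation matrix: rows = households, columns = indicators.
dep <- matrix(
  c(1, 1, 0,
    0, 1, 0,
    1, 1, 1,
    0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(NULL, c("schooling", "water", "housing"))
)

indicator_w <- c(0.5, 0.25, 0.25)   # indicator weights (sum to 1), as in a specification file
survey_w    <- c(100, 200, 50, 150) # assumed survey expansion factors, one per household
k           <- 1 / 3                # assumed poverty cutoff

# Weighted deprivation score per household, then identification of the poor.
score <- as.vector(dep %*% indicator_w)
poor  <- score >= k

# Survey-weighted headcount ratio (H), intensity among the poor (A), and MPI = H * A.
H   <- sum(survey_w * poor) / sum(survey_w)
A   <- sum(survey_w * score * poor) / sum(survey_w * poor)
MPI <- H * A

print(c(H = H, A = A, MPI = MPI))
```

The indicator weights enter once, when the score is built; the survey weights enter only when households are aggregated, so applying indicator weights a second time to an already weighted matrix would double count them.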

yng-me · 2023-12-26T10:05:54Z

See #18

yng-me added the enhancement (New feature or request) and question (Further information is requested) labels Nov 17, 2023

yng-me pushed a commit that referenced this issue Dec 26, 2023

Enhancement based on #17

d59302f

yng-me closed this as completed Dec 26, 2023
