Consider adding documentation on searching many terms at once #39

Closed
parmsam opened this issue Mar 4, 2022 · 4 comments

Comments

@parmsam
Contributor

parmsam commented Mar 4, 2022

For example, applying npi_search() to a list of cities or NPI numbers by using purrr in combination with npi package functions. I can work on this.
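
A minimal sketch of the pattern described above, assuming the search function accepts a `city` argument (as `npi_search()` does) and returns a data frame. `search_by_city()` and `fake_search()` are hypothetical names used only for illustration, and the stand-in means no requests are sent to the NPPES API.

```r
library(purrr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: run `search_fn` once per city and stack the results.
# `search_fn` is assumed to accept a `city` argument and return a data frame.
search_by_city <- function(cities, search_fn) {
  cities %>%
    purrr::set_names() %>%                   # name each element after its city
    purrr::map(~ search_fn(city = .x)) %>%   # one search per city, run serially
    dplyr::bind_rows(.id = "queried_city")   # record which query produced each row
}

# Stand-in for npi_search() so the sketch runs without network access
fake_search <- function(city) {
  data.frame(npi = nchar(city), enumeration_type = "Individual")
}

res <- search_by_city(c("Atlanta", "Decatur"), fake_search)
```

With the real package, the call would presumably be `search_by_city(cities, npi::npi_search)`, ideally combined with a pause between requests.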

@frankfarach
Collaborator

I like this idea for a second vignette. My only concern with using purrr this way is that it makes it easy to hammer the API with too many requests in a short period of time. I wasn't able to find any specific rate-limit policies on the NPPES website. That said, there are plenty of legitimate use cases for purrr with npi. We can model responsible API usage by wrapping the search function in a version that pauses before each request, as shown below. NPPES also regularly releases the whole dataset as a free download (~800 MB) for those who want it.

library(npi)
library(purrr)
library(dplyr, warn.conflicts = FALSE)

# Higher-order function to delay execution of `f` by a specified number of seconds
delay_by <- function(f, seconds) {
  force(f)
  force(seconds)
  
  function(...) {
    Sys.sleep(seconds)
    f(...)
  }
}

# Set up delay
delay <- 5
npi_friendly_search <- delay_by(npi_search, seconds = delay)

# Search and collect multiple specific NPI records into one tibble
# This pattern works for serially executing searches for multiple values of a single query parameter

npis <- c(1992708929, 1831192848, 1699778688, 1111111111)  # Last element doesn't exist

out <- npis %>% 
  purrr::map(~ npi_friendly_search(number = .x)) %>% 
  dplyr::bind_rows()
#> Requesting records 0-10...
#> Requesting records 0-10...
#> Requesting records 0-10...
#> Requesting records 0-10...

# Print results and summary
out
#> # A tibble: 3 × 11
#>       npi enumeration_type basic    other_names identifiers taxonomies addresses
#>     <int> <chr>            <list>   <list>      <list>      <list>     <list>   
#> 1  1.99e9 Organization     <tibble> <tibble>    <tibble>    <tibble>   <tibble> 
#> 2  1.83e9 Individual       <tibble> <tibble>    <tibble>    <tibble>   <tibble> 
#> 3  1.70e9 Individual       <tibble> <tibble>    <tibble>    <tibble>   <tibble> 
#> # … with 4 more variables: practice_locations <list>, endpoints <list>,
#> #   created_date <dttm>, last_updated_date <dttm>

npi_summarize(out)
#> # A tibble: 3 × 6
#>          npi name       enumeration_type primary_practic… phone primary_taxonomy
#>        <int> <chr>      <chr>            <chr>            <chr> <chr>           
#> 1 1992708929 NOVAMED M… Organization     3200 DOWNWOOD C… 404-… Clinic/Center A…
#> 2 1831192848 MATTHEW J… Individual       3672 MARATHON C… 770-… Orthopaedic Sur…
#> 3 1699778688 STEVEN PA… Individual       5064 NANDINA LN… 770-… Dentist General…

Created on 2022-03-04 by the reprex package (v2.0.1)
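
As an alternative to a hand-written `delay_by()`, recent versions of purrr provide `slowly()`, which wraps a function so that successive calls are spaced out according to a rate object such as `rate_delay()`. The sketch below shows that approach; `fake_search()` is a hypothetical stand-in for `npi_search()` so that nothing is sent to the API, and the 0.2-second pause is an arbitrary value chosen for illustration.

```r
library(purrr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for npi_search(); returns a one-row data frame
fake_search <- function(number) {
  data.frame(npi = number)
}

# Wrap the search so that successive calls are separated by a pause (in seconds)
slow_search <- purrr::slowly(fake_search, rate = purrr::rate_delay(pause = 0.2))

npis <- c(1992708929, 1831192848, 1699778688)

# Serially search each NPI and combine the results into one data frame
out <- npis %>%
  purrr::map(~ slow_search(number = .x)) %>%
  dplyr::bind_rows()
```

Swapping `fake_search` for `npi_search` and raising `pause` should give behavior similar to `npi_friendly_search()` above.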

@parmsam
Contributor Author

parmsam commented Mar 6, 2022

Awesome! I like the idea of adding an adjustable delay of five seconds (or less) to npi_search() for more responsible use. It's important that users understand the underlying dataset can be easily downloaded. I'll include a sentence on this in my next pull request, along with documentation on using purrr::map() with npi_search().

@frankfarach
Collaborator

Someone commented yesterday on this issue about running into a deactivated NPI, which triggered an API error that halted their code. Unfortunately, I don't see their comment anymore. In any case, here is a way to safely iterate using purrr::safely():

library(tidyverse)
library(npi)

# Safe mode for npi_search() - store results and errors in a list, 
# returning an empty tibble when there is an error
safe_npi_search <- purrr::safely(npi_search, otherwise = tibble::tibble())

# First NPI is deactivated (per the public NPPES deactivation file); 
# the second is still active.
npis <- c(1407954555, 1992776843)

# Use the new function form above to safely iterate over npis despite errors
out <- purrr::map(npis, ~ safe_npi_search(number = .x)) %>% purrr::transpose()
#> Requesting records 0-10...
#> Requesting records 0-10...

# Inspect the output for errors and extract results into one data frame
out$error
#> [[1]]
#> <error/request_logic_error>
#> Error in `npi_handle_response()` at npi/R/api.R:35:2:
#> ! 
#> Field: number
#> CMS deactivated NPI 1407954555. The provider can no longer use this NPI. Our public registry does not display provider information about NPIs that are not in service.
#> Backtrace:
#>   1. purrr::map(npis, ~safe_npi_search(number = .x)) %>% purrr::transpose()
#>   3. purrr::map(npis, ~safe_npi_search(number = .x))
#>   4. global .f(.x[[i]], ...)
#>   5. purrr safe_npi_search(number = .x)
#>  14. npi .f(...)
#>  15. npi:::npi_process_results(...)
#>        at npi/R/npi_search.R:147:2
#>  16. npi:::npi_control_requests(params, user_n = params[["limit"]])
#>        at npi/R/npi_search.R:172:2
#>  17. npi:::npi_get_results(results = results, query = query)
#>        at npi/R/npi_handle_requests.R:72:2
#>  18. npi:::npi_get(npi_url(), query = ...)
#>        at npi/R/npi_handle_requests.R:13:2
#>  19. npi:::npi_api("GET", url, ...)
#>        at npi/R/api.R:44:2
#>  20. npi:::npi_handle_response(resp)
#>        at npi/R/api.R:35:2
#> 
#> [[2]]
#> NULL

res <- out$result %>% dplyr::bind_rows()
res
#> # A tibble: 1 × 11
#>       npi enumeration_type basic    other_names identifiers taxonomies addresses
#>     <int> <chr>            <list>   <list>      <list>      <list>     <list>   
#> 1  1.99e9 Individual       <tibble> <tibble>    <tibble>    <tibble>   <tibble> 
#> # … with 4 more variables: practice_locations <list>, endpoints <list>,
#> #   created_date <dttm>, last_updated_date <dttm>

Created on 2022-03-14 by the reprex package (v2.0.1)
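
One way to keep track of which inputs failed is to name the inputs before mapping, so that errors stay labelled by query value. This is a sketch under assumptions, not necessarily what the vignette does: `search_safely()` and `fake_search()` are hypothetical names, and the stand-in raises an error for one value instead of calling the API.

```r
library(purrr)
library(dplyr, warn.conflicts = FALSE)

# Hypothetical helper: search each number, collecting results and errors separately.
# `search_fn` is assumed to accept a `number` argument and return a data frame.
search_safely <- function(numbers, search_fn) {
  safe_fn <- purrr::safely(search_fn, otherwise = NULL)

  # Name each element after its query value so failures can be traced back
  out <- numbers %>%
    purrr::set_names(as.character(numbers)) %>%
    purrr::map(~ safe_fn(number = .x))

  # Named character vector of error messages, one per failed query
  errors <- out %>%
    purrr::map("error") %>%
    purrr::compact() %>%
    purrr::map_chr(conditionMessage)

  # Successful results stacked into one data frame
  results <- out %>%
    purrr::map("result") %>%
    purrr::compact() %>%
    dplyr::bind_rows()

  list(results = results, errors = errors)
}

# Stand-in for npi_search() that fails for one value, mimicking a deactivated NPI
fake_search <- function(number) {
  if (number == 1407954555) stop("CMS deactivated NPI ", number)
  data.frame(npi = number)
}

res <- search_safely(c(1407954555, 1992776843), fake_search)
```

Here `names(res$errors)` identifies the query values that failed, while `res$results` holds only the successful rows.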

@frankfarach
Collaborator

Addressed in the advanced-uses vignette
