Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

geocodePL_get returns sf data.frame #43

Merged
merged 12 commits into from
Oct 31, 2020
Merged

geocodePL_get returns sf data.frame #43

merged 12 commits into from
Oct 31, 2020

Conversation

BERENZ
Copy link
Contributor

@BERENZ BERENZ commented Oct 30, 2020

Proposal for a changed output of geocodePL_get. Now it returns object of a sf data.frame class instead of a list.

R/geocodePL_get.R Outdated Show resolved Hide resolved
R/geocodePL_get.R Outdated Show resolved Hide resolved
R/geocodePL_get.R Outdated Show resolved Hide resolved
@kadyb
Copy link
Owner

kadyb commented Oct 30, 2020

output = jsonlite::fromJSON(prepared_URL)[["results"]]
if (!is.null(output)) {

output can be NULL?

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 30, 2020

Yes, it can be. For example when address was not found

> address = "Poznań, Aleja Niepodległości"
> base_URL = "https://services.gugik.gov.pl/uug/?request=GetAddress&address="
> prepared_URL = paste0(base_URL, address)
> prepared_URL = gsub(" ", "%20", prepared_URL)
> output = jsonlite::fromJSON(prepared_URL)[["results"]]
> output
NULL

@kadyb
Copy link
Owner

kadyb commented Oct 30, 2020

Good catch!

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 30, 2020

I noticed that it is very sensitive to the address specification. See for instance the following examples:

> address = "Poznań, Ząbkowicka 12H"
> base_URL = "https://services.gugik.gov.pl/uug/?request=GetAddress&address="
> prepared_URL = paste0(base_URL, address)
> prepared_URL = gsub(" ", "%20", prepared_URL)
> output = jsonlite::fromJSON(prepared_URL)[["results"]]
> str(output,1)
List of 1
 $ 1:List of 16

> address = "Poznań, Zabkowicka 12H"
> base_URL = "https://services.gugik.gov.pl/uug/?request=GetAddress&address="
> prepared_URL = paste0(base_URL, address)
> prepared_URL = gsub(" ", "%20", prepared_URL)
> output = jsonlite::fromJSON(prepared_URL)[["results"]]
> output
> str(output,1)
List of 1
 $ 1:List of 16

> address = "Poznań, Ząbkowicka 12 H"
> base_URL = "https://services.gugik.gov.pl/uug/?request=GetAddress&address="
> prepared_URL = paste0(base_URL, address)
> prepared_URL = gsub(" ", "%20", prepared_URL)
> output = jsonlite::fromJSON(prepared_URL)[["results"]]
> output
NULL

> address = "Poznań Ząbkowicka 12H"
> base_URL = "https://services.gugik.gov.pl/uug/?request=GetAddress&address="
> prepared_URL = paste0(base_URL, address)
> prepared_URL = gsub(" ", "%20", prepared_URL)
> output = jsonlite::fromJSON(prepared_URL)[["results"]]
> output
NULL

@kadyb
Copy link
Owner

kadyb commented Oct 30, 2020

Suppose

output = jsonlite::fromJSON(prepared_URL)[["results"]]

returns NULL. In that case, stop("all inputs are empty") will be executed?

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 30, 2020

Yes, see below

> geocodePL_get(address = "Poznań Ząbkowicka 12H")
Error in geocodePL_get(address = "Poznań Ząbkowicka 12H") : 
  all inputs are empty

@kadyb
Copy link
Owner

kadyb commented Oct 30, 2020

So that's not right. This error should appear after the user doesn't enter the input. If jsonlite::fromJSON returns NULL, then NULL should also be returned at the end.

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 30, 2020

Ok, so I should include the stop function somewhere inside, like here:

if (!nchar(rail_crossing) == 11) {
      stop("rail crossing ID must be 11 characters long")
 }

or maybe rearrange the whole function so the df_output will be created at the very end? This may help to avoid to many stops inside.

@kadyb
Copy link
Owner

kadyb commented Oct 30, 2020

Maybe just before stop("all inputs are empty")
use:

if (is.null(output)) {
  return(NULL)
}

@kadyb
Copy link
Owner

kadyb commented Oct 30, 2020

Oh, sorry. I thought after you commit. It is probably better for the user to be informed that the name is possibly wrong than NULL.
So return(NULL) change to return("object not found") or something similar.

R/geocodePL_get.R Outdated Show resolved Hide resolved
R/geocodePL_get.R Show resolved Hide resolved
R/geocodePL_get.R Outdated Show resolved Hide resolved
@kadyb
Copy link
Owner

kadyb commented Oct 30, 2020

I haven't checked if geocodePL_get() works after the changes yet. I will do it tomorrow.

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 31, 2020

Ok, please check if my proposal fits your requirements.

How should the unit test for wrong or no address look like? Should I use expect_true to check if output is equal to object not found ?

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

How should the unit test for wrong or no address look like? Should I use expect_true to check if output is equal to object not found ?

In testthat there is expect_output() function for this, so expect_output(t6, "object not found").

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

test = geocodePL_get(geoname = "Jeziorak")

doesn't work because of NULL in lists.

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 31, 2020

The error is because of different number of elements in each list

List of 5
 $ 1:List of 13
 $ 2:List of 12
 $ 3:List of 12
 $ 4:List of 13
 $ 5:List of 12

So using the standard do.call and rbind fails. Maybe we could use dplyr::bind_rows ?

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

if (length(output) == 1) {
   output[[1]][sapply(output[[1]], is.null)] = NA
}

Maybe the length condition is too restrictive / unfounded? What if the output consists of multiple lists (like in #43 (comment))?
Maybe we should test first if there is NULL in the lists. If so, then replace it in each list with NA.

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

So using the standard do.call and rbind fails. Maybe we could use dplyr::bind_rows ?

I would not like to introduce a new dependency. The 13th extra column is "notes", maybe we'll just remove it in each list and then merge the lists?

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 31, 2020

Done. Why the checks are not passed?

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

ERROR (test-emuia_download.R:19:1): (code run outside of test_that())
FAILURE (test-geocodePL_get.R:26:3): check if object was not found
Warning (test-minmaxDTM_get.R:23:1): (code run outside of test_that())

1 and 3 I don't know why. It worked yesterday.
2 is related to:

test_that("check if object was not found", {
  expect_output(t6, "object not found")
})

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

Maybe try change to expect_equal(t6, "object not found") because we didn't print object not found exactly (we use return("object not found")). This could be the reason.

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 31, 2020

Maybe expect_output is not the right choice?

> expect_output(t6, "object not found")
Error: `t6` produced no output

May I use expect_true(t6 == "object not found")?

EDIT: what is the difference between expect_true and expect_equal in this case? Which one is preferred?

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

Probably expect_equal is preferable for comparing values. Indeed, maybe expect_true will be better.

@kadyb
Copy link
Owner

kadyb commented Oct 31, 2020

I think I can merge this PR now. Please also add your pearson to DESCRIPTION as ctb before "GUGiK". Thanks for the help.

If you still have time and you would like to, you can address:

  1. Remove unnecessary columns from the output (Output unification from geocode #11).
  2. Creating a helper (private) function to avoid duplicating the code.

@BERENZ
Copy link
Contributor Author

BERENZ commented Oct 31, 2020

Thank you! I will try to help with the development of this package.

@kadyb kadyb merged commit 3292f18 into kadyb:master Oct 31, 2020
@kadyb kadyb mentioned this pull request Dec 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants