NA values break annotator #3

Open
JBGruber opened this issue Apr 12, 2023 · 4 comments

Comments

@JBGruber
Member

Reprex:

df <- data.frame(
  url = LETTERS,
  headline = LETTERS,
  text = letters
)
df$headline[3] <- NA

library(annotinder)
units <- create_units(
  df,
  id = "url",
  text = set_text(name = "Petition", value = text, label = headline),
  url = set_markdown(name = "URL", value = url, label = "Original URL")
)

codebook <- question(
  name = "relevant",
  question = "Is this text relevant?",
  codes = c("Yes", "No"),
  type = "annotinder"
) |>
  create_codebook()

job <- create_job("relevant_petition", units, codebook)
job_db_file <- create_job_db(job, overwrite = TRUE)

start_annotator(job_db_file,
                background = TRUE, browse = TRUE)

The first two units work normally, but the third stops the annotation process and leaves me with a blank white browser window.

@JBGruber
Member Author

I just noticed: this only applies to the label value. NAs are converted to null in the internal JSON format.
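
Until this is handled in the package, a possible user-side workaround is to replace missing labels before calling create_units. This is a minimal sketch that assumes an empty string is an acceptable label for the client, which has not been verified here:

```r
# Same data as in the reprex above
df <- data.frame(
  url = LETTERS,
  headline = LETTERS,
  text = letters
)
df$headline[3] <- NA

# Workaround (assumption: an empty label renders fine in the client):
# replace NA in the column that is used as the label
df$headline[is.na(df$headline)] <- ""
```

After this step the headline column contains no NA, so the label should no longer be serialized as null.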

@kasperwelbers
Member

Thanks, I'll look into it!

One thing about how you now create the units: is it intentional that you want to use the headline as the label of the "Petition" field? It's actually a bit coincidental that this is now possible (though it shouldn't do any harm). I initially intended labels to be fixed values for fields, so coders know what field they're working in (which mainly makes sense in the annotation mode where you label texts).

For adding things like headlines, my current idea was to have two main approaches.

  • use markdown all the way
  • use another set_text to add a field for the headline. This is a bit more flexible because you can style it as you'd like

@JBGruber
Member Author

I totally agree that it does not make a lot of sense to use the headlines for this. Markdown is a better idea for sure.

Still, it would be good to make sure that NA values are either transformed into something annotinder can deal with or that set_text errors verbosely.

Let me know which one you prefer and I can do a PR.

@kasperwelbers
Member

Perhaps it's best for now to throw an error for an NA (which we should be able to just add in eval_value in create_units.r).
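
A rough sketch of what such a check could look like. The helper name stop_on_na and its arguments are hypothetical and not part of annotinder; how it would be wired into eval_value depends on the actual code in create_units.r:

```r
# Hypothetical helper, not annotinder's actual API.
# `values` is the evaluated vector for one field attribute (e.g. the label);
# `field` and `attribute` are only used to build the error message.
stop_on_na <- function(values, field, attribute) {
  missing_rows <- which(is.na(values))
  if (length(missing_rows) > 0) {
    stop(sprintf(
      "Field '%s': attribute '%s' contains NA in row(s) %s. Replace missing values before calling create_units().",
      field, attribute, paste(missing_rows, collapse = ", ")
    ), call. = FALSE)
  }
  invisible(values)
}

# Usage with the headline column from the reprex
headline <- LETTERS
headline[3] <- NA
result <- tryCatch(
  stop_on_na(headline, field = "Petition", attribute = "label"),
  error = function(e) conditionMessage(e)
)
```

Reporting the affected rows in the message is a design choice here, so users can find the missing values without inspecting the job database.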

For user experience it might be nice if NA were automatically treated as an empty string IF it concerns a string value, but I'm not sure how easy this is to implement in R with the current design. I went a bit overboard in allowing expressions for all field attributes (value, label, style settings), and I'm not sure how easy it is to infer types.
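
For illustration, the type-dependent conversion could be sketched as below. The function na_to_empty is hypothetical and only looks at the type of the already evaluated vector, which sidesteps rather than solves the question of inferring types from expressions:

```r
# Hypothetical helper, not part of annotinder.
# Replaces NA with "" only when the evaluated values are character;
# other types (numeric, logical, ...) are returned unchanged.
na_to_empty <- function(values) {
  if (is.character(values)) {
    values[is.na(values)] <- ""
  }
  values
}

labels <- na_to_empty(c("A", "B", NA))   # character: NA becomes ""
sizes  <- na_to_empty(c(1.5, NA, 2))     # numeric: left as is
```

One caveat of this approach: a plain NA (or a vector consisting only of plain NA) is logical in R, so it would not be recognized as a string value unless it is created as NA_character_.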

Perhaps it's easier to solve this in the TypeScript client. The benefit would be that the solution immediately works for other clients (e.g., in Python). So if for now we just throw an error for NAs, we can remove it once the client handles missing values properly.

2 participants