How to get labels and hints #77

Closed
dmenne opened this issue Jun 18, 2020 · 9 comments · Fixed by #98
Labels: external (depends on external changes), feature (a feature request or enhancement), help wanted

@dmenne
Contributor

dmenne commented Jun 18, 2020

Here is a snippet to get labels and hints into the form_schema output. It must be improved for forms where translations are present. The "guidance" field may need a second look, and an XPath expert could simplify the paths.

Feel free to use it or not.

library(ruODK)
library(xml2)
library(dplyr)
library(stringr)
library(tidyr)

ru_setup(
  svc = "https:xxxxx",
  un = "yy",
  pw = "xx",
  tz = Sys.timezone()
) 


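f_xml = as_xml_document(form_xml())
ff_xml = tibble(
  # Drop the leading "/data" (the first five characters) from each id
  path = str_sub(xml_text(xml_find_all(f_xml, "//translation/text/@id")), 6),
  label = xml_text(xml_find_all(f_xml, "//translation/text"))
) %>%
  # Split ids such as "/pat_gruppe:label" into path and type
  separate(path, sep = ":", into = c("path", "type")) %>%
  # One column per type (label, hint, ...)
  pivot_wider(names_from = type, values_from = label)

# Attach labels and hints to the schema by path
fs_extended = form_schema(flatten = FALSE) %>%
  left_join(ff_xml, by = "path")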

Here is an example from the XML file:

        <translation default="true()" lang="Deutsch (de)">
          <text id="/data/pat_gruppe:label">
            <value>Patient</value>
          </text>
          <text id="/data/pat_gruppe/pat_no_barcode:label">
            <value>Patienten-Barcode</value>
          </text>
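
To make the id handling concrete, here is a minimal sketch that runs offline on an inline stand-in for the form XML. It assumes the namespaces have been stripped (for example with xml2's `xml_ns_strip()`), and the `:hint` entry and its text are hypothetical additions for illustration. It uses only xml2 and base R rather than the tidyverse calls in the snippet above.

```r
library(xml2)

# Inline stand-in for form_xml(); only the translation part of a form is shown.
# The ":hint" entry is a hypothetical addition, not taken from the real form.
form <- read_xml('<itext>
  <translation default="true()" lang="Deutsch (de)">
    <text id="/data/pat_gruppe:label"><value>Patient</value></text>
    <text id="/data/pat_gruppe/pat_no_barcode:label"><value>Patienten-Barcode</value></text>
    <text id="/data/pat_gruppe/pat_no_barcode:hint"><value>Bitte scannen</value></text>
  </translation>
</itext>')

texts <- xml_find_all(form, "//translation/text")
ids <- xml_attr(texts, "id")

# Long format: one row per text element
labels_long <- data.frame(
  path = sub("^/data", "", sub(":[^:]*$", "", ids)),  # drop ":type", then "/data"
  type = sub("^.*:", "", ids),                        # keep what follows the last ":"
  value = xml_text(xml_find_first(texts, "./value")),
  stringsAsFactors = FALSE
)

# Wide format: one row per path, columns value.label and value.hint
labels_wide <- reshape(labels_long, idvar = "path", timevar = "type",
                       direction = "wide")
```

With several `<translation>` blocks the same path and type occur once per language, so the language would need to become part of the key before reshaping.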
@dmenne dmenne added the feature a feature request or enhancement label Jun 18, 2020
@florianm
Collaborator

Thanks Dieter! I'll add this to form_schema_parse (used up to odkc v7) and submit a feature request for the new form_schema (direct JSON from odkc) to include these.

@dmenne
Contributor Author

dmenne commented Jun 19, 2020

Better not yet. I will try more forms and update you. This was a first try.

@matthew-white

Part of me thinks that it could be useful to add something like this to the Central API: there may be enough use cases where a user needs these labels that the best approach would be for Central to provide that information. Feel free to create a topic in the Features category of the ODK forum if something along those lines would be helpful!

@florianm
Collaborator

florianm commented Jun 26, 2020

@matthew-white it would be awesome to have labels and hints included in https://odkcentral.docs.apiary.io/#reference/forms-and-submissions/'-individual-form/getting-form-schema-fields, e.g. as a nested list:

  {
    "name": "age",
    "path": "/age",
    "type": "int",
    "label": {"en": "Age", "de": "Alter"},
    "hint": {"en": "...", "de": "..."}
  },
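
If Central returned something along these lines, the R side could flatten it into one column per language. The sketch below is only an illustration: the response shape is the proposal above and not an existing Central field, the hint texts are made up, and `flatten_field` is a hypothetical helper name.

```r
library(jsonlite)

# Hypothetical response following the proposed shape; hint texts are invented.
proposed <- '[
  {"name": "age", "path": "/age", "type": "int",
   "label": {"en": "Age", "de": "Alter"},
   "hint":  {"en": "In years", "de": "In Jahren"}}
]'

fields <- fromJSON(proposed, simplifyVector = FALSE)

# Turn one field into a one-row data frame with label_<lang> and hint_<lang> columns
flatten_field <- function(f) {
  base <- f[c("name", "path", "type")]
  labels <- setNames(f$label, paste0("label_", names(f$label)))
  hints <- setNames(f$hint, paste0("hint_", names(f$hint)))
  as.data.frame(c(base, labels, hints), stringsAsFactors = FALSE)
}

schema <- do.call(rbind, lapply(fields, flatten_field))
```

This assumes every field carries the same set of languages; fields with differing languages would need their columns aligned before `rbind()`.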

Crosslink: Feature request on the ODK Forum

@florianm florianm added this to the CRAN milestone Jun 29, 2020
@florianm florianm added the external Depends on external changes label Jul 9, 2020
@mtyszler
Contributor

I submitted an issue at Central (getodk/central#172)

@mtyszler
Contributor

mtyszler commented Oct 19, 2020

@dmenne

This is a good solution:

ru_setup(
  svc = "https:xxxxx",
  un = "yy",
  pw = "xx",
  tz = Sys.timezone()
)

f_xml = as_xml_document(form_xml())
ff_xml = tibble(
  path = str_sub(xml_text(xml_find_all(f_xml, "//translation/text/@id")), 6),
  label = xml_text(xml_find_all(f_xml, "//translation/text"))
) %>%
  separate(path, sep = ":", into = c("path", "type")) %>%
  pivot_wider(names_from = type, values_from = label)

fs_extended = form_schema(flatten = FALSE) %>%
  left_join(ff_xml, by = "path")

but please be aware that this works only when there are multiple translations. If there is a single language, labels are stored in a different way (there's no translation element in the XML tree).

I'll submit a suggestion soon.
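
As a rough illustration of the single-language case, the sketch below reads labels that sit inline under each question element. The XML is a hypothetical, namespace-free stand-in for a form body; a real form carries XForms namespaces that would have to be handled first, for example with xml2's `xml_ns_strip()`.

```r
library(xml2)

# Hypothetical single-language body: labels are inline, with no translation block.
body <- read_xml('<body>
  <input ref="/data/age"><label>Age</label><hint>In years</hint></input>
  <select1 ref="/data/sex">
    <label>Sex</label>
    <item><label>Female</label><value>f</value></item>
    <item><label>Male</label><value>m</value></item>
  </select1>
</body>')

# Questions are elements with a ref attribute and a label child;
# this skips the <item> elements of choice lists.
questions <- xml_find_all(body, "//*[@ref and label]")

inline_labels <- data.frame(
  path = sub("^/data", "", xml_attr(questions, "ref")),
  label = xml_text(xml_find_first(questions, "./label")),
  hint = xml_text(xml_find_first(questions, "./hint")),  # NA when there is no hint
  stringsAsFactors = FALSE
)
```

The resulting `path` column has the same form as the one used for the join in the snippet above, so it could be joined to the schema in the same way.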

@mtyszler
Contributor

I prepared this function:

library(xml2)


# The function below uses the same signature as form_schema(),
# so any call to form_schema() can be replaced by form_schema_ext().
# In addition to the form_schema() columns, it returns the default label and,
# if available, one label column per language.
# It also returns the choice lists with their labels, per language if available.

form_schema_ext <-  function (flatten = FALSE, odata = FALSE, parse = TRUE, pid = get_default_pid(), 
                              fid = get_default_fid(), url = get_default_url(), un = get_default_un(), 
                              pw = get_default_pw(), odkc_version = get_default_odkc_version(), 
                              retries = get_retries(), verbose = get_ru_verbose()) 
{
  
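  # gets basic schema (arguments passed by name to avoid positional mix-ups)
  frm_schema <- form_schema(flatten = flatten, odata = odata, parse = parse,
                            pid = pid, fid = fid, url = url, un = un, pw = pw,
                            odkc_version = odkc_version,
                            retries = retries, verbose = verbose)
  
  # gets xml representation
  frm_xml <- as_xml_document(form_xml(parse, pid, fid,
                                      url, un, pw,
                                      retries))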
  
  
  ### parse translations:
  all_translations <- xml_find_all(frm_xml, "//text")
  
  # initialize dataframe
  extension <- data.frame(path = character(0), label = character(0), 
                          stringsAsFactors = FALSE)
  
  
  ### PART 1: parse labels:
  raw_labels <- xml_find_all(frm_xml, "//label")
  
  # iterate through labels (seq_along() is safe when no labels are found)
  for (i in seq_along(raw_labels)){
    
    ## path
    # gets ref from parent, without leading "/data"
    this_path <- sub("^/data", "",
                     xml_attr(xml_parent(raw_labels[i]), "ref"))
    
    # ensure this is a valid path
    if (!is.na(this_path)) {
      
      
      # adds new empty row:
      extension[nrow(extension)+1, ]<-rep(NA, ncol(extension))
      
      # adds path
      extension[nrow(extension), 'path'] <- this_path
      
    
      ## reads label
      this_rawlabel <-raw_labels[i]
      
      # first checks if it is multi-language label
      multi_lang <- xml_has_attr(this_rawlabel, "ref")
      
      if (multi_lang) {
        # if multi-language, finds all translations related to this path:
        id <- paste0("/data", this_path, ":label")
        translations <- all_translations[xml_attr(all_translations, "id") == id]
        
        # iterate through translations
        for (j in seq_along(translations)) {
          
          # first check this is a regular text labels. Questions in ODK can have video, image and audio "labels", 
          # which will be skipped. This is identified by the presence of the 'form' attribute:
          is_regular_label <- !xml_has_attr(xml_find_first(translations[j],"./value"), "form")
          
          if (is_regular_label) {
            # reads the parent node to identify language:
            translation_parent<- xml_parent(translations[j])
            this_lang <- gsub(" ", "_", tolower(xml_attr(translation_parent, "lang")))
            
            # decide if 'default' language or specific language
            if (this_lang == "default") {
              # if 'default' language, save under column 'label':
              extension[nrow(extension), 'label'] <- xml_text(xml_find_first(translations[j],"./value"))
            }
            else {
              # check if language already exists in the dataframe
              if (!(paste0("label_",this_lang) %in% colnames(extension))){
                
                # if not, create new column
                extension <- cbind(extension, data.frame(new_lang = rep(NA, nrow(extension))))
                colnames(extension)[ncol(extension)] <- paste0("label_",this_lang)
              }
              
              # adds the first value content of the translation
              extension[nrow(extension), paste0("label_",this_lang)] <- xml_text(xml_find_first(translations[j],"./value"))
            }
            
          }
          
        }
        
      }
      else {
        # extract content
        extension[nrow(extension), 'label'] <- xml_text(this_rawlabel)
        
      }
      
      ### PART 1.1: parse choice labels
      ## checks existence of  choice list:
      choice_items<-xml_find_all(xml_parent(this_rawlabel), "./item")
      if (length(choice_items)>0) {
        
        # check if 'choices' column already exist
        if (!('choices' %in% colnames(extension))){
          
          # if not, create new column
          extension <- cbind(extension, data.frame(choices = rep(NA, nrow(extension))))
        }
        
        # initialize lists
        choice_values <- list()
        choice_labels <- list()
        
        # iterate through choice list:
        for (jj in seq_along(choice_items)) {
          
          #value
          this_choicevalue<-xml_text(xml_find_first(choice_items[jj], "./value"))
          choice_values[jj]<-this_choicevalue
          
          # raw label
          this_rawchoicelabel <- xml_find_first(choice_items[jj], "./label")
          
          # first checks if it is multi-language choice label
          multi_lang_choice <- xml_has_attr(this_rawchoicelabel, "ref")
          
          if (multi_lang_choice) {
            id_choice <- paste0("/data", this_path,"/",this_choicevalue, ":label")
            choice_translations <- all_translations[xml_attr(all_translations, "id") == id_choice]
            
            
            # iterate through choice translations
            for (kk in seq_along(choice_translations)) {
              
              # first check this is a regular text labels. Questions in ODK can have video, image and audio "labels", 
              # which will be skipped. This is identified by the presence of the 'form' attribute:
              is_regular_choicelabel <- !xml_has_attr(xml_find_first(choice_translations[kk],"./value"), "form")
              
              if (is_regular_choicelabel) {
                # reads the parent node to identify language:
                choice_translation_parent<- xml_parent(choice_translations[kk])
                this_choicelang <- gsub(" ", "_", tolower(xml_attr(choice_translation_parent, "lang")))
                
                # decide if 'default' language or specific language
                if (this_choicelang == "default") {
                  # if 'default' language, save under 'choice':
                  choice_labels[['base']][jj] <- xml_text(xml_find_first(choice_translations[kk],"./value"))
                }
                else {
                  # check if language already exists in the dataframe
                  if (!(paste0("choices_",this_choicelang) %in% colnames(extension))){
                    
                    # if not, create new column
                    extension <- cbind(extension, data.frame(new_choicelang = rep(NA, nrow(extension))))
                    colnames(extension)[ncol(extension)] <- paste0("choices_",this_choicelang)
                  }
                  
                  # adds the first value content of the translation
                  choice_labels[[paste0("choices_",this_choicelang)]][jj] <- xml_text(xml_find_first(choice_translations[kk],"./value"))
                }
                
              }
              
            }
          }
          else {
          
            choice_labels[['base']][jj]<- xml_text(this_rawchoicelabel)
          }
        }
        
        # add to the extended table:
        for (this_choicelang in names(choice_labels)) {
          these_choicelabels <- choice_labels[[this_choicelang]]
          
          if (this_choicelang == "base"){
            this_choicelang_colname <- "choices"
          }
          else {
            this_choicelang_colname <-this_choicelang
          }
          
          extension[nrow(extension), this_choicelang_colname] <- list(list(list(values = unlist(choice_values), 
                                                                                labels = unlist(these_choicelabels))))
        }

      }
      
    }
  }
  
  
  # join:
  # plain function call, since only xml2 is attached and %>% is not available
  fs_ext <- dplyr::left_join(frm_schema, extension, by = "path")
  
  ##
  return(fs_ext)
}

In addition to what the function from @dmenne does, this also provides choice lists and handles multiple languages.

Here is example output from a form with multi-language labels and a single-language choice list:

| path | name | type | ruodk_name | label | label_english_(en) | label_french_(fr) | choices |
| --- | --- | --- | --- | --- | --- | --- | --- |
| /some_text | some_text | string | some_text | NA | This is a basic fill in the blank question. | (FRENCH) This is a basic fill in the blank question. | NA |
| /text_image_audio_video_test | text_image_audio_video_test | string | text_image_audio_video_test | NA | This question shows how to use translations and media types. | This question shows how to use translations and media types. | NA |
| /a_integer | a_integer | int | a_integer | NA | Enter a integer: | Enter a integer: | NA |
| /a_decimal | a_decimal | decimal | a_decimal | NA | Enter a decimal: | Enter a decimal: | NA |
| /calculate | calculate | string | calculate | NA | NA | NA | NULL |
| /calculate_test_output | calculate_test_output | string | calculate_test_output | NA | The sum of the integer and decimal: | The sum of the integer and decimal: | NA |
| /test_yn | test_yn | string | test_yn | NA | What do you think? | Ça va? | list(values = (0, 1, 99), labels = ("Yes", "No", "Maybe")) |
| /meta | meta | structure | meta | NA | NA | NA | NULL |
| /meta/instanceID | instanceID | string | meta_instance_id | NA | NA | NA | NULL |
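
A short sketch of how the `choices` column could be used to turn stored values into labels. The one-row data frame is a hypothetical stand-in for the extended schema; depending on how the cell was assigned, the real column may have one more level of list nesting, in which case an extra `[[1]]` would be needed.

```r
# Hypothetical one-row stand-in for the extended schema
fs_ext <- data.frame(path = "/test_yn", stringsAsFactors = FALSE)
fs_ext$choices <- list(list(values = c("0", "1", "99"),
                            labels = c("Yes", "No", "Maybe")))

# Pull the choice list for one question and build a value -> label lookup
ch <- fs_ext$choices[[which(fs_ext$path == "/test_yn")]]
lookup <- setNames(ch$labels, ch$values)

# Translate some submitted values into their labels
answers <- c("1", "99", "0")
answer_labels <- unname(lookup[answers])
```

Columns such as `label_english_(en)` are not syntactic R names, so they need backticks or `[[` to access, e.g. `fs_ext[["label_english_(en)"]]`.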

I haven't stress-tested it, but I'll try to turn it into a pull request once I have the time.

Feel free to test and comment.

@florianm
Collaborator

Nice work! This would warrant a new test form. Unit tests could run against that form, and also against the current forms without translations.

@dmenne
Contributor Author

dmenne commented Oct 20, 2020

Great! I had already noticed that the function sometimes failed, but did not have the time to find out why. You saved my day!
