Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Case_when translation for ORACLE does not work #3521

Closed
cderv opened this issue Apr 18, 2018 · 7 comments
Closed

Case_when translation for ORACLE does not work #3521

cderv opened this issue Apr 18, 2018 · 7 comments
Labels
bug an unexpected problem or unintended behavior wip work in progress

Comments

@cderv
Copy link
Contributor

cderv commented Apr 18, 2018

I work with an oracle connection so difficult to make a reprex sorry. I just will use the code as-is as I think it is enough to see the issue in SQL translation.

When using dplyr on an oracle connection, I had this error

tu_cc_pt %>% 
mutate(niv_u = case_when(
  	PT_NIVEAU_TENSION == "90 KV" ~ "HT",
  	PT_NIVEAU_TENSION == "63 KV" ~ "HT",
  	TRUE ~ PT_NIVEAU_TENSION)) %>%
collect()

The error message is

Error in new_result(connection@ptr, statement) : nanodbc/nanodbc.cpp:1344: HY000: ORA-00920: invalid relational operator 

A call to show_query instead of collect gives the following SQL

SELECT "ID_POSTE_TECHNIQUE", "PT_IDR", "PT_CUR", "PT_TYPE", "PT_STATUT_SEQ", "PT_STATUT_NON_SEQ", "PT_NIVEAU_TENSION", "PT_FILERIE_LIBELLE", "PT_TECHNOLOGIE", "PT_PLAN_PROTECTION", "PT_DT_PREM_MISE_SS_TENSION", "PT_CM", "PT_GMR", CASE
WHEN ("PT_NIVEAU_TENSION" = '90 KV') THEN ('HT')
WHEN ("PT_NIVEAU_TENSION" = '63 KV') THEN ('HT')
WHEN (TRUE) THEN ("PT_NIVEAU_TENSION")
END AS "niv_u"
FROM (DMS_SIDONI_DL_RO.DMBC_CC_PT)

This as been mentionned in #2928 but not fixed it seems.

I think the problem is with WHEN (TRUE) that is not valid with ORACLE. It should be a ELSE

SELECT "ID_POSTE_TECHNIQUE", "PT_IDR", "PT_CUR", "PT_TYPE", "PT_STATUT_SEQ", "PT_STATUT_NON_SEQ", "PT_NIVEAU_TENSION", "PT_FILERIE_LIBELLE", "PT_TECHNOLOGIE", "PT_PLAN_PROTECTION", "PT_DT_PREM_MISE_SS_TENSION", "PT_CM", "PT_GMR", CASE
WHEN ("PT_NIVEAU_TENSION" = '90 KV') THEN ('HT')
WHEN ("PT_NIVEAU_TENSION" = '63 KV') THEN ('HT')
ELSE ("PT_NIVEAU_TENSION")
END AS "niv_u"
FROM (DMS_SIDONI_DL_RO.DMBC_CC_PT)

What do you think ?

If you need another simpler example, please tell me I will work on one.

@copernican
Copy link

As a workaround, you might try 1L == 1L ~ value in place of TRUE ~ value.

@cderv
Copy link
Contributor Author

cderv commented Apr 29, 2018

I tried something to add the correct translation to oracle CASE WHEN.

See case-when-oracle branch in my fork.

Fisrt, add in a case_when in sql_translator

sql_translate_env.Oracle <- function(con) {
  sql_variant(
    sql_translator(.parent = base_odbc_scalar,
      # Data type conversions are mostly based on this article
      # https://docs.oracle.com/cd/B19306_01/server.102/b14200/sql_elements001.htm
      as.character  = sql_cast("VARCHAR(255)"),
      as.numeric    = sql_cast("NUMBER"),
      as.double     = sql_cast("NUMBER"),
      case_when = function(...) sql_case_when_oracle(...)
    ),
    base_odbc_agg,
    base_odbc_win
  )
}

and a sql_case_when_oracle function base on sql_case_when but just adding a special treatment if last formula in dplyr::case_when is like TRUE ~ <something>:
if (query[[n]] == "TRUE") clauses[[n]] <- paste0("ELSE (", value[[n]], ")") }.

sql_case_when_oracle <- function(...) {
  # TODO: switch to dplyr::case_when_prepare when available

  formulas <- dots_list(...)
  n <- length(formulas)

  if (n == 0) {
    abort("No cases provided")
  }

  query <- vector("list", n)
  value <- vector("list", n)

  for (i in seq_len(n)) {
    f <- formulas[[i]]

    env <- environment(f)
    query[[i]] <- escape(eval_bare(f[[2]], env), con = sql_current_con())
    value[[i]] <- escape(eval_bare(f[[3]], env), con = sql_current_con())
  }

  clauses <- purrr::map2_chr(query, value, ~ paste0("WHEN (", .x, ") THEN (", .y, ")"))
  # if a formula like TRUE ~ "other" is at the end of a sequence, use ELSE statement
  if (query[[n]] == "TRUE") {
    clauses[[n]] <- paste0("ELSE (", value[[n]], ")")
  }
  sql(paste0(
    "CASE\n",
    paste0(clauses, collapse = "\n"),
    "\nEND"
  ))
}

there is surely other way to do that and I am willing to help with a PR if you put a the correct path, one that you could accept.
It is the first time I dig into dbplyr translation mechanism, so I am not sure to be able to see all the way to implement correct translation.

Moreover, a CASE WHEN ... THEN ... ELSE .... statement may be more generic and this fix could be apply to all backend.

@edgararuiz have you some advices for me on this ? Thanks.

@edgararuiz-zz
Copy link

Hi @cderv , thank you for researching this. Let me take a closer look and follow up with you. Thank you for your willingness to send over a PR.

@hadley
Copy link
Member

hadley commented May 20, 2018

I think the right way to fix this is to modify sql_case_when for all databases, using something like the approach you propose. I'd be happy to review a PR.

@hadley hadley added bug an unexpected problem or unintended behavior wip work in progress labels May 20, 2018
@cderv
Copy link
Contributor Author

cderv commented May 23, 2018

OK I'll work on something.

@ghost
Copy link

ghost commented Jun 25, 2018

This issue was moved by krlmlr to tidyverse/dbplyr/issues/112.

@ghost ghost closed this as completed Jun 25, 2018
@lock
Copy link

lock bot commented Dec 22, 2018

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Dec 22, 2018
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug an unexpected problem or unintended behavior wip work in progress
Projects
None yet
Development

No branches or pull requests

5 participants