function gson doesn't properly handle the keytype argument #9

Open
guidohooiveld opened this issue Apr 30, 2024 · 1 comment

guidohooiveld commented Apr 30, 2024

An error reported on the clusterProfiler GitHub turns out to originate from an 'incomplete' GSON object.
See: YuLab-SMU/clusterProfiler#685 (comment)

I did some debugging and found that the problem seems to originate in the gson function, hence my post here. The slot keytype remains empty even though a value is provided for it.

@GuangchuangYu : could you please have a look at this?

gson/R/gson.R

Lines 27 to 29 in 519e148

gson <- function(gsid2gene, gsid2name = NULL, gene2name = NULL,
                 species = NULL, gsname = NULL, version = NULL,
                 accessed_date = NULL, keytype = NULL, urlpattern = NULL, info = NULL) {

Code illustrating problem (as per post at the clusterProfiler GitHub)

> ## load library
> library(clusterProfiler)
> 
> ## some ids
> id_transform <- c("240427","12705","241770","102633301","319757","116903","72309")
> 
> ## generate GSON-object with pathway information
> kk <- gson_KEGG('mmu')
> 
> ## use GSON as input: FAILS!
> KEGG_enrich = enrichKEGG(gene = id_transform,
+                          organism=kk,
+                          use_internal_data = FALSE)
Error in (function (cl, name, valueClass)  : 
  assignment of an object of class “NULL” is not valid for @‘keytype’ in an object of class “enrichResult”; is(value, "character") is not TRUE
> 
> 
> ## check GSON-object
> kk
>> Gene Set: KEGG
>> 9710 genes annotated by 355 gene sets.
>> Species: mmu
>> Version: Release 110.0+/04-27, Apr 24
> 
> ## note that slot keytype is NULL!
> str(kk)
Formal class 'GSON' [package "gson"] with 9 slots
  ..@ gsid2gene    :'data.frame':       38640 obs. of  2 variables:
  .. ..$ gsid: chr [1:38640] "mmu00010" "mmu00010" "mmu00010" "mmu00010" ...
  .. ..$ gene: chr [1:38640] "103988" "106557" "110695" "11522" ...
  ..@ gsid2name    :'data.frame':       355 obs. of  2 variables:
  .. ..$ gsid: chr [1:355] "mmu01100" "mmu01200" "mmu01210" "mmu01212" ...
  .. ..$ name: chr [1:355] "Metabolic pathways - Mus musculus (house mouse)" "Carbon metabolism - Mus musculus (house mouse)" "2-Oxocarboxylic acid metabolism - Mus musculus (house mouse)" "Fatty acid metabolism - Mus musculus (house mouse)" ...
  ..@ gene2name    : NULL
  ..@ species      : chr "mmu"
  ..@ gsname       : chr "KEGG"
  ..@ version      : chr "Release 110.0+/04-27, Apr 24"
  ..@ accessed_date: chr "2024-04-30"
  ..@ keytype      : NULL
  ..@ info         : NULL
> 
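The error message above is the standard S4 check on slot assignment: the keytype slot of enrichResult must hold a character value, so the NULL carried by the GSON object is rejected. Below is a minimal base-R sketch of that mechanism. The classes `ToyGSON` and `ToyResult` and the union `charOrNULL` are hypothetical stand-ins, not the real gson or clusterProfiler definitions.

```r
library(methods)

# Hypothetical stand-ins for illustration only (not the real class definitions).
setClassUnion("charOrNULL", c("character", "NULL"))
setClass("ToyGSON", representation(keytype = "charOrNULL"))   # tolerates NULL
setClass("ToyResult", representation(keytype = "character"))  # requires character

g <- new("ToyGSON", keytype = NULL)
res <- new("ToyResult", keytype = "UNKNOWN")

# Copying the NULL slot into a character-only slot triggers the S4 check.
msg <- tryCatch({
  res@keytype <- g@keytype
  "no error"
}, error = function(e) conditionMessage(e))

print(msg)
```

In this toy setup the assignment fails with a message of the same form as the one reported by enrichKEGG, which suggests the failure is a downstream symptom of the empty keytype slot rather than a separate problem.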

Debugging
Stepping through the function gson_KEGG

https://github.com/YuLab-SMU/clusterProfiler/blob/2ab30a92f1791dce75f71ea29b71c33fc443d4a0/R/gson.R#L12-L25

> library(clusterProfiler)
> library(gson)
> 
> ## define unexported helper function 'kegg_release'
> ## https://github.com/YuLab-SMU/clusterProfiler/blob/2ab30a92f1791dce75f71ea29b71c33fc443d4a0/R/gson.R#L285-L289
> kegg_release = function(db){
+   url = paste0("https://rest.kegg.jp/info/", db)
+   y = readLines(url)
+   release <- sub("\\w+\\s+", "", y[grep('Release', y)])
+   return(release) }
> 
> ## define required arguments
> species = 'hsa'
> KEGG_Type="KEGG"
> keyType="kegg"
> 
> ## run code (as per function gson_KEGG)
> x <- download_KEGG(species, KEGG_Type, keyType)
Reading KEGG annotation online: "https://rest.kegg.jp/link/hsa/pathway"...
Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/hsa"...
> gsid2gene <- setNames(x[[1]], c("gsid", "gene"))
> gsid2name <- setNames(x[[2]], c("gsid", "name"))
> version <- kegg_release(species)
> 
> ## create GSON-object
> My.GSON <-gson(gsid2gene = gsid2gene, 
+                gsid2name = gsid2name,
+                species = species,
+                gsname = "KEGG",
+                version = version,
+                accessed_date = as.character(Sys.Date(),
+                keytype = "ENTREZID")
+                )
> 
> 
> ## check
> ## Note that slot keytype = NULL! And not "ENTREZID".
> str(My.GSON)
Formal class 'GSON' [package "gson"] with 9 slots
  ..@ gsid2gene    :'data.frame':       36987 obs. of  2 variables:
  .. ..$ gsid: chr [1:36987] "hsa00010" "hsa00010" "hsa00010" "hsa00010" ...
  .. ..$ gene: chr [1:36987] "10327" "124" "125" "126" ...
  ..@ gsid2name    :'data.frame':       359 obs. of  2 variables:
  .. ..$ gsid: chr [1:359] "hsa01100" "hsa01200" "hsa01210" "hsa01212" ...
  .. ..$ name: chr [1:359] "Metabolic pathways" "Carbon metabolism" "2-Oxocarboxylic acid metabolism" "Fatty acid metabolism" ...
  ..@ gene2name    : NULL
  ..@ species      : chr "hsa"
  ..@ gsname       : chr "KEGG"
  ..@ version      : chr "Release 110.0+/04-28, Apr 24"
  ..@ accessed_date: chr "2024-04-30"
  ..@ keytype      : NULL
  ..@ info         : NULL
> 
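One detail in the gson() call above is worth checking before attributing the NULL slot to gson() itself. As typed, the closing parenthesis of `as.character(` comes after `keytype = "ENTREZID"`, so that argument is handed to as.character() and never reaches gson(). The base-R sketch below shows the difference using a hypothetical stand-in function (`toy_gson`, not the real constructor). Whether the linked gson_KEGG source has the same parenthesis placement is not checked here.

```r
# Hypothetical stand-in that only records what it receives (not the real gson()).
toy_gson <- function(accessed_date = NULL, keytype = NULL) {
  list(accessed_date = accessed_date, keytype = keytype)
}

d <- as.Date("2024-04-30")  # fixed date so the sketch is reproducible

# Parenthesis placement as in the call above: keytype goes to as.character().
as_posted <- toy_gson(accessed_date = as.character(d,
                      keytype = "ENTREZID")
                      )

# Parenthesis moved: keytype is now an argument of toy_gson().
moved <- toy_gson(accessed_date = as.character(d),
                  keytype = "ENTREZID")

str(as_posted$keytype)
str(moved$keytype)
```

With the parenthesis moved, the corresponding lines of the real call would read `accessed_date = as.character(Sys.Date()),` followed by `keytype = "ENTREZID"`. That form has not been run against the gson package here.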
@guidohooiveld

@GuangchuangYu : I noticed on GitHub that you have been actively working on several of your projects over the last few days. Hence a kind reminder to 'squash this bug' as well. :)
