Could not reproduce certain code, can you help? #5

Closed
billchenxi opened this issue Jan 24, 2017 · 1 comment

@billchenxi

library(biomaRt)
myMart <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")
ensemblID <- "ENSG00000008130"
unip <- useDataset("uniprot",mart=useMart("unimart"))

interpro <- getBM(attributes=c("interpro_id","go_id"),
                    filters="ensembl_id", values="ENSG00000008130",mart=unip)

Error in useMart("unimart") :
Incorrect BioMart name, use the listMarts function to see which BioMart databases are available

Thanks.

@HajkD

HajkD commented Jan 24, 2017

Hi,

Please note that this package is called biomartr and not biomaRt.
biomartr is designed to extend the biomaRt package and provides a
user-friendly search strategy for retrieving annotation information stored in
the BioMart database.

First, the biomartr package allows you to retrieve the correct
attribute names by typing:

biomartr::organismAttributes("Homo sapiens", topic = "interpro")
                         name                description               dataset                 mart
                        <chr>                      <chr>                 <chr>                <chr>
1                    interpro                Interpro ID hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
2  interpro_short_description Interpro Short Description hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
3        interpro_description       Interpro Description hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
4              interpro_start             Interpro start hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
5                interpro_end               Interpro end hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
6  interpro_short_description InterPro Short Description    hsapiens_gene_vega ENSEMBL_MART_ENSEMBL
7                    interpro                Interpro ID    hsapiens_gene_vega ENSEMBL_MART_ENSEMBL
8        interpro_description       InterPro Description    hsapiens_gene_vega ENSEMBL_MART_ENSEMBL
9              interpro_start             Interpro start    hsapiens_gene_vega ENSEMBL_MART_ENSEMBL
10               interpro_end               Interpro end    hsapiens_gene_vega ENSEMBL_MART_ENSEMBL
# ... with 40 more rows

As you can see in rows 1 and 7, the attribute for the InterPro ID is named interpro,
and not interpro_id as in your query.

The attribute go_id, on the other hand, is correct:

biomartr::organismAttributes("Homo sapiens", topic = "go")
                              name                                  description               dataset                 mart
                             <chr>                                        <chr>                 <chr>                <chr>
1                            go_id                            GO Term Accession hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
2                  go_linkage_type                        GO Term Evidence Code hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
3             goslim_goa_accession                      GOSlim GOA Accession(s) hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
4           goslim_goa_description                       GOSlim GOA Description hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
5         vpacos_homolog_goc_score         Alpaca Gene-order conservation score hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
6       pformosa_homolog_goc_score   Amazon molly Gene-order conservation score hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
7  acarolinensis_homolog_goc_score   Anole lizard Gene-order conservation score hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
8  dnovemcinctus_homolog_goc_score      Armadillo Gene-order conservation score hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
9     ogarnettii_homolog_goc_score       Bushbaby Gene-order conservation score hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
10 cintestinalis_homolog_goc_score C.intestinalis Gene-order conservation score hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
# ... with 435 more rows

The biomartr function organismFilters() will now allow you to retrieve the
correct filter name for your query:

biomartr::organismFilters("Homo sapiens", topic = "ensembl")
                                     name                                                  description                       dataset                 mart
                                    <chr>                                                        <chr>                         <chr>                <chr>
1        with_ox_clone_based_ensembl_gene                          with clone based Ensembl gene ID(s)         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
2  with_ox_clone_based_ensembl_transcript                    with clone based Ensembl transcript ID(s)         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
3                         ensembl_gene_id                            Gene ID(s) [e.g. ENSG00000139618]         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
4                   ensembl_transcript_id                      Transcript ID(s) [e.g. ENST00000380152]         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
5                      ensembl_peptide_id                         Protein ID(s) [e.g. ENSP00000369497]         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
6                         ensembl_exon_id                            Exon ID(s) [e.g. ENSE00001508081]         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
7           clone_based_ensembl_gene_name           Clone based Ensembl gene name(s) [e.g. AL162430.1]         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
8     clone_based_ensembl_transcript_name Clone based Ensembl transcript name(s) [e.g. AL162430.1-201]         hsapiens_gene_ensembl ENSEMBL_MART_ENSEMBL
9                            ensembl_gene                                  Gene stable ID(s) [Max 500]                  hsapiens_snp ENSEMBL_MART_ENSEMBL
10                        ensembl_gene_id  Ensembl Gene ID(s) (e.g. ENSG00000210049) [Max 500 advised] hsapiens_mirna_target_feature ENSEMBL_MART_ENSEMBL
# ... with 40 more rows

Thus, as row 3 shows, the correct filter name for the Ensembl gene ID is ensembl_gene_id
and not ensembl_id.

Finally, you can use the getMarts() function of biomartr to list the available marts:

biomartr::getMarts()
                  mart               version
1  ENSEMBL_MART_ENSEMBL      Ensembl Genes 87
2    ENSEMBL_MART_MOUSE      Mouse strains 87
3 ENSEMBL_MART_SEQUENCE              Sequence
4 ENSEMBL_MART_ONTOLOGY              Ontology
5  ENSEMBL_MART_GENOMIC   Genomic features 87
6      ENSEMBL_MART_SNP  Ensembl Variation 87
7  ENSEMBL_MART_FUNCGEN Ensembl Regulation 87
8     ENSEMBL_MART_VEGA               Vega 67

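As you can see, the ENSEMBL_MART_ENSEMBL mart (Ensembl Genes 87) is the one to use
when retrieving information for genes. Note that no mart named unimart appears in
this list, which is consistent with the error message you received.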

If you now run the biomart() function of the biomartr package with these names,
you will retrieve the results:

biomartr::biomart(genes = "ENSG00000008130",
                  mart = "ENSEMBL_MART_ENSEMBL",
                  dataset = "hsapiens_gene_ensembl",
                  attributes = c("interpro","go_id"),
                  filters = "ensembl_gene_id"
                  )
   ensembl_gene_id  interpro      go_id
1  ENSG00000008130 IPR002504 GO:0005829
2  ENSG00000008130 IPR002504 GO:0046034
3  ENSG00000008130 IPR002504 GO:0019674
4  ENSG00000008130 IPR002504 GO:0046872
5  ENSG00000008130 IPR002504 GO:0016740
6  ENSG00000008130 IPR002504 GO:0005524
7  ENSG00000008130 IPR002504 GO:0005515
8  ENSG00000008130 IPR002504 GO:0003951
9  ENSG00000008130 IPR002504 GO:0000166
10 ENSG00000008130 IPR002504 GO:0016310
11 ENSG00000008130 IPR002504 GO:0008152
12 ENSG00000008130 IPR002504 GO:0006741
13 ENSG00000008130 IPR002504 GO:0016301
14 ENSG00000008130 IPR016064 GO:0005829
15 ENSG00000008130 IPR016064 GO:0046034
16 ENSG00000008130 IPR016064 GO:0019674
17 ENSG00000008130 IPR016064 GO:0046872
18 ENSG00000008130 IPR016064 GO:0016740
19 ENSG00000008130 IPR016064 GO:0005524
20 ENSG00000008130 IPR016064 GO:0005515
21 ENSG00000008130 IPR016064 GO:0003951
22 ENSG00000008130 IPR016064 GO:0000166
23 ENSG00000008130 IPR016064 GO:0016310
24 ENSG00000008130 IPR016064 GO:0008152
25 ENSG00000008130 IPR016064 GO:0006741
26 ENSG00000008130 IPR016064 GO:0016301
27 ENSG00000008130 IPR017437 GO:0005829
28 ENSG00000008130 IPR017437 GO:0046034
29 ENSG00000008130 IPR017437 GO:0019674
30 ENSG00000008130 IPR017437 GO:0046872
31 ENSG00000008130 IPR017437 GO:0016740
32 ENSG00000008130 IPR017437 GO:0005524
33 ENSG00000008130 IPR017437 GO:0005515
34 ENSG00000008130 IPR017437 GO:0003951
35 ENSG00000008130 IPR017437 GO:0000166
36 ENSG00000008130 IPR017437 GO:0016310
37 ENSG00000008130 IPR017437 GO:0008152
38 ENSG00000008130 IPR017437 GO:0006741
39 ENSG00000008130 IPR017437 GO:0016301
40 ENSG00000008130 IPR017438 GO:0005829
41 ENSG00000008130 IPR017438 GO:0046034
42 ENSG00000008130 IPR017438 GO:0019674
43 ENSG00000008130 IPR017438 GO:0046872
44 ENSG00000008130 IPR017438 GO:0016740
45 ENSG00000008130 IPR017438 GO:0005524
46 ENSG00000008130 IPR017438 GO:0005515
47 ENSG00000008130 IPR017438 GO:0003951
48 ENSG00000008130 IPR017438 GO:0000166
49 ENSG00000008130 IPR017438 GO:0016310
50 ENSG00000008130 IPR017438 GO:0008152
51 ENSG00000008130 IPR017438 GO:0006741
52 ENSG00000008130 IPR017438 GO:0016301
53 ENSG00000008130 
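
Note that rows 1 to 52 above pair each of 4 InterPro IDs with each of 13 GO IDs (4 x 13 = 52), so every ID is repeated many times. If only the distinct IDs are needed, the result can be collapsed with unique(). The following is a minimal offline sketch under that assumption: it rebuilds those 52 rows by hand instead of querying BioMart, and the variable names (interpro_ids, go_ids, grid, result) are illustrative only and not part of biomartr.

```r
# Rebuild rows 1-52 of the output above locally (no network access needed).
interpro_ids <- c("IPR002504", "IPR016064", "IPR017437", "IPR017438")
go_ids <- c("GO:0005829", "GO:0046034", "GO:0019674", "GO:0046872",
            "GO:0016740", "GO:0005524", "GO:0005515", "GO:0003951",
            "GO:0000166", "GO:0016310", "GO:0008152", "GO:0006741",
            "GO:0016301")

# expand.grid() varies its first argument fastest, so go_id is listed first
# to reproduce the row order shown above.
grid <- expand.grid(go_id = go_ids, interpro = interpro_ids,
                    stringsAsFactors = FALSE)

result <- data.frame(ensembl_gene_id = "ENSG00000008130",
                     interpro = grid$interpro,
                     go_id = grid$go_id,
                     stringsAsFactors = FALSE)

nrow(result)              # 52 rows, one per InterPro/GO pairing
unique(result$interpro)   # the 4 distinct InterPro IDs
unique(result$go_id)      # the 13 distinct GO IDs
```

The same unique() calls should work on the data frame returned by the query itself, since it has the same column names.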

I hope this example demonstrates the advantage of the biomartr
package over the biomaRt package when constructing BioMart queries.
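
For comparison, the original biomaRt query could probably be repaired by plugging in the mart, dataset, attribute, and filter names identified above. This is only a sketch: it assumes the biomaRt package is installed, that the Ensembl BioMart service is reachable, and that these names are still valid in the current Ensembl release. The wrapper function query_interpro_go() is a hypothetical helper introduced here for illustration and is not part of biomaRt or biomartr.

```r
# Hypothetical helper (not part of biomaRt or biomartr): runs the corrected
# query with biomaRt directly. Requires network access when called.
query_interpro_go <- function(gene_id = "ENSG00000008130") {
  if (!requireNamespace("biomaRt", quietly = TRUE)) {
    stop("Package 'biomaRt' is required for this sketch.")
  }

  # Connect to the gene mart and the human dataset named in the tables above.
  mart <- biomaRt::useMart(biomart = "ENSEMBL_MART_ENSEMBL",
                           dataset = "hsapiens_gene_ensembl")

  # 'interpro' (not 'interpro_id') and 'ensembl_gene_id' (not 'ensembl_id')
  # are the corrected attribute and filter names.
  biomaRt::getBM(attributes = c("ensembl_gene_id", "interpro", "go_id"),
                 filters = "ensembl_gene_id",
                 values = gene_id,
                 mart = mart)
}
```

Calling query_interpro_go() with no arguments would then send the query for ENSG00000008130.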

Please don't forget to cite my paper, so that I can keep
developing this package :)

Hajk-Georg Drost, Jerzy Paszkowski; Biomartr: genomic data retrieval with R. Bioinformatics 2017 btw821. doi: 10.1093/bioinformatics/btw821

Many thanks in advance and I hope I could help you :)

Best wishes
Hajk

@HajkD HajkD closed this as completed Jan 24, 2017
@HajkD HajkD reopened this Jan 24, 2017
@HajkD HajkD added the question label Jan 24, 2017
@HajkD HajkD closed this as completed Feb 10, 2017