We will start by running a query you are now familiar with: "find genes upregulated in adipose tissue". You have seen how to run this query in the user interface and with the Python API; now you will run it in R. First, load the InterMineR library:

In [None]:
library(InterMineR)

Then select the InterMine database to query, in this case HumanMine. Note that the listMines() function lists all InterMine instances available to use. Replace {your token} with your own API token.

In [None]:
im <- initInterMine(mine=listMines()["HumanMine"], token="{your token}")
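
Rather than pasting the token into the notebook, it can be read from an environment variable. The following is a minimal base R sketch of that idea; the helper name `get_token()` and the variable name `INTERMINE_TOKEN` are hypothetical and not part of InterMineR.

```r
# Hypothetical helper: read an API token from an environment variable.
# INTERMINE_TOKEN is an assumed name; use whatever name you export in your shell.
get_token <- function(var = "INTERMINE_TOKEN") {
  token <- Sys.getenv(var, unset = "")
  if (!nzchar(token)) {
    stop("Environment variable ", var, " is not set", call. = FALSE)
  }
  token
}

# Possible usage, assuming the variable was set before starting R:
# im <- initInterMine(mine = listMines()["HumanMine"], token = get_token())
```

Failing early with a clear message avoids sending an empty token to the server and getting a less obvious error later.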

Create the expressedAdipose query object:

In [None]:
expressedAdipose = newQuery()

Now set the constraints:

In [None]:
adiposeConstraint1 = setConstraints(
  paths = c("AtlasExpression.condition",
            "AtlasExpression.pValue", "AtlasExpression.expression"),
  operators = c("=", "<=", "="),
  values = list(c("adipose tissue"), "0.01", "UP")
)


and the select list:

In [None]:
expressedAdipose = setQuery(
  select = c("AtlasExpression.gene.symbol",
             "AtlasExpression.condition", 
             "AtlasExpression.expression", 
             "AtlasExpression.pValue", 
             "AtlasExpression.tStatistic", 
             "AtlasExpression.dataSets.name"
  ),
  orderBy = list(c(AtlasExpression.gene.symbol = "ASC")),
  where = adiposeConstraint1
)


Finally run the query and store the results in query_results.

In [None]:
query_results <-  runQuery(im = im, qry = expressedAdipose)

Take a look at the first few lines of the results to ensure they look correct:

In [None]:
head(query_results)
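
Beyond eyeballing the first rows, the constraints can be checked programmatically. The sketch below runs on a small mock data frame rather than the real query_results, so the gene names and values are placeholders. It also assumes the result columns are returned as character strings, which is why pValue is converted before comparison. The helper `check_constraints()` is hypothetical.

```r
# Mock stand-in for query_results; the real object comes from runQuery().
# Gene names and p-values are placeholders, not real data.
mock_results <- data.frame(
  AtlasExpression.gene.symbol = c("GENE_A", "GENE_B", "GENE_C"),
  AtlasExpression.expression  = c("UP", "UP", "UP"),
  AtlasExpression.pValue      = c("1e-05", "0.002", "0.0099"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if every row is "UP" with pValue <= max_p.
check_constraints <- function(res, max_p = 0.01) {
  p <- as.numeric(res[, "AtlasExpression.pValue"])
  all(res[, "AtlasExpression.expression"] == "UP") && all(p <= max_p)
}

check_constraints(mock_results)
```

With the real data, the same call would be `check_constraints(query_results)`.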

We want to use the set of genes returned by this query in our second query, to find those that interact with the PPARG gene.  Our query_results object includes several columns of data but we can save just the gene symbols in a vector as follows:

In [None]:
genes <- query_results[,"AtlasExpression.gene.symbol"]

Check that genes looks correct:

In [None]:
head(genes)
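
The same gene symbol may appear in more than one result row, and some rows may lack a symbol. The sketch below shows one way to tidy such a vector in base R before reusing it as a constraint value. It uses mock placeholder symbols, and the helper `clean_symbols()` is hypothetical.

```r
# Mock stand-in for the symbol column of query_results (placeholder names).
mock_symbols <- c("GENE_A", "GENE_B", "GENE_A", NA, "", "GENE_C")

# Hypothetical helper: drop missing or empty symbols, then de-duplicate.
clean_symbols <- function(x) {
  x <- x[!is.na(x) & nzchar(x)]  # remove NA and empty strings
  unique(x)                      # keep the first occurrence of each symbol
}

clean_symbols(mock_symbols)
```

Applied to the real data this would be `genes <- clean_symbols(genes)`.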

The next query in the workflow took the set of genes from above and looked at which of these interact with PPARG. To construct this query we are going to modify a pre-existing template. Use the getTemplateQuery function to load the template called "geneInteractiongene". This is the Gene A --> Interaction <-- Gene B template that was used in the user interface demo.

In [None]:
q = getTemplateQuery(im, 'geneInteractiongene')

We want to modify the constraints a little, first to set the "interactors" to our genes set saved above, but also to add organism constraints:

In [None]:
interactAdiposeConstraint = setConstraints(
  paths = c("Gene",
            "Gene.interactions.participant2.symbol", "Gene.interactions.participant2.organism.shortName", "Gene.organism.shortName"),
  operators = c("LOOKUP", "=", "=", "="),
  values = list(c("PPARG"), c(genes), c("H. sapiens"), c("H. sapiens"))
) 


We can then set up the query to use the same select list as the template (q$select) and the constraints we set above:

In [None]:
interactAdiposeQuery = setQuery(
  select = q$select,
  where = interactAdiposeConstraint
) 


Now run the query and save the results in query_results2:

In [None]:
query_results2 <- runQuery(im = im, qry = interactAdiposeQuery)

Again, check the results to ensure they look correct:

In [None]:
head(query_results2) 

This time we are interested in the set of genes that show an interaction with PPARG. We can grab the set of gene symbols from the results:

In [None]:
interactors <- unique(query_results2[, "Gene.interactions.participant2.symbol"])
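
Because the interaction query restricted the second participant to the adipose gene set, every symbol in interactors should also be present in genes. The sketch below shows that check with `setdiff()`. The two vectors are mock placeholders standing in for genes and interactors.

```r
# Mock stand-ins for genes and interactors (placeholder names).
mock_genes       <- c("GENE_A", "GENE_B", "GENE_C", "GENE_D")
mock_interactors <- c("GENE_B", "GENE_D")

# Symbols returned by the interaction query that were not in the input set.
# An empty result means the constraint behaved as expected.
unexpected <- setdiff(mock_interactors, mock_genes)
length(unexpected) == 0
```

With the real objects the check would be `setdiff(interactors, genes)`.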

Now, instead of looking at associations with diabetes, we will further explore this set of genes that are upregulated in adipose tissue and interact with PPARG, using gene ontology (GO) enrichment. InterMine databases provide a number of enrichment widgets for various types of annotation, depending on the individual InterMine. For example, as well as GO enrichment, the HumanMine database also provides enrichment statistics for protein domains, pathways, publications, etc. For this exercise we will look at GO enrichment. First, take a look at the widgets that are provided with HumanMine:

In [None]:
human.widgets = as.data.frame(getWidgets(im))

subset(human.widgets, widgetType == 'enrichment' & targets == "Gene")

To run the enrichment widget, use the doEnrichment function with the following arguments. Note that we are running it with the set of interactors we saved above.

In [None]:
GO_enrichResult = doEnrichment(
  im = im,
  ids = interactors,
  widget = "go_enrichment_for_gene"
)


Take a look at the stats returned:

In [None]:
head(GO_enrichResult$data)
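
To focus on the most significant terms, the enrichment table can be filtered and sorted by p-value. The sketch below uses a mock data frame with placeholder identifiers and values. The column names `identifier`, `description`, `pValue` and `count`, and their storage as character strings, are assumptions about the shape of GO_enrichResult$data, so check them against your own output. The helper `top_terms()` is hypothetical.

```r
# Mock stand-in for GO_enrichResult$data; identifiers and values are placeholders.
mock_enrich <- data.frame(
  identifier  = c("TERM_1", "TERM_2", "TERM_3"),
  description = c("term one", "term two", "term three"),
  pValue      = c("0.20", "0.001", "0.03"),
  count       = c("2", "5", "3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep terms with pValue <= alpha, most significant first.
top_terms <- function(df, alpha = 0.05) {
  df$pValue <- as.numeric(df$pValue)            # convert from character
  df <- df[df$pValue <= alpha, , drop = FALSE]  # filter by threshold
  df[order(df$pValue), , drop = FALSE]          # sort ascending by p-value
}

top_terms(mock_enrich)
```

With the real results, the call would be `top_terms(GO_enrichResult$data)`.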

One advantage of using InterMineR is that you can feed the results of InterMine queries into other R packages for further analysis and visualisation. As a simple example we will take the results of our enrichment above and feed them into a package called GeneAnswers, which allows us to visualise the results. To make this easier, InterMineR includes a function called convertToGeneAnswers. First, though, we must load the GeneAnswers package:

In [None]:
library(GeneAnswers)

Now create a GeneAnswers object using the function as follows:

In [None]:
geneanswer_object = convertToGeneAnswers(
  enrichmentResult = GO_enrichResult, 
  geneInput = data.frame(GeneID = as.character(interactors), 
                             stringsAsFactors = FALSE),
  geneInputType = "Gene.symbol",
  annLib = 'org.Hs.eg.db',
  categoryType = "GO.MF"
)


and take a look at this:

In [None]:
summary(geneanswer_object)

The GeneAnswers package comes with many functions, but below are three examples of simple plots you can create to visualise your enrichment results:

In [None]:
geneAnswersChartPlots(geneanswer_object, 
                      chartType='pieChart',
                      sortBy = 'geneNum',
                      top = 5)

In [None]:
geneAnswersChartPlots(geneanswer_object, 
                      chartType='barPlot',
                      sortBy = 'geneNum',
                      top = 5)


The concept net helps visualise the overlap between the gene sets enriched for the specified terms (in this case the top 3 GO terms):

In [None]:
geneAnswersConceptNet(geneanswer_object, 
                      colorValueColumn=NULL,
                      centroidSize='correctedPvalue', 
                      output='interactive',
                      catTerm = FALSE,
                      catID = FALSE,
                      showCats = 1:3)
