# Detect presence/absence of genes with ARIBA
One of the most basic tasks ARIBA can be used for is detecting the presence or absence of a set of genes. Let us look at an example. But first, let's check that you're in the right place. Type the command below in the terminal window, followed by the Enter key:

In [None]:
pwd

It should display something like:

`/home/manager/course_data/amr/data`

OK, back to the example! We have sequence data for three _Neisseria gonorrhoeae_ samples and want to determine whether the genes `fitA` and `fbpA` are present or absent in each of them. Let us use ARIBA to determine this from the raw sequence data.

First list the fastq files for the samples.

In [None]:
ls fastq

For the purpose of this exercise, we have created a fasta file containing the sequences of the genes `fitA` and `fbpA`. Take a look:

In [None]:
cd basic
ls
cat genes.fasta
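
If you would rather not read through the full output of `cat`, the sketch below shows one way to summarise a fasta file with standard shell tools. The helper names `count_fasta_records` and `list_fasta_names` are hypothetical, made up for this illustration. They rely only on the fact that every fasta record starts with a header line beginning with `>`.

```shell
# Hypothetical helpers (not part of ARIBA) for summarising a fasta file.

# Count records: each record starts with a header line beginning with ">".
count_fasta_records() {
    grep -c '^>' "$1"
}

# List record names: the header text up to the first space, without the ">".
list_fasta_names() {
    grep '^>' "$1" | sed 's/^>//' | cut -d ' ' -f 1
}

# Example usage, from inside the basic directory:
#   count_fasta_records genes.fasta
#   list_fasta_names genes.fasta
```

The names printed by `list_fasta_names` are the sequence names that ARIBA will later group into clusters.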

Running ARIBA is generally a three-stage process:
* prepare a reference database
* run ARIBA on each sample
* summarise the ARIBA results for all samples

First prepare the reference database:

In [None]:
ariba prepareref -f genes.fasta --all_coding yes out

Next, run ARIBA on each of the three samples:

In [None]:
ariba run out ../fastq/ERR1067813_1.fastq.gz ../fastq/ERR1067813_2.fastq.gz \
ariba_results_ERR1067813

In [None]:
ariba run out ../fastq/ERR1067814_1.fastq.gz ../fastq/ERR1067814_2.fastq.gz \
ariba_results_ERR1067814

In [None]:
ariba run out ../fastq/ERR1067815_1.fastq.gz ../fastq/ERR1067815_2.fastq.gz \
ariba_results_ERR1067815
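
The three commands above differ only in the sample accession, so they can also be generated with a loop. The following is a minimal sketch, not part of the original exercise: `print_ariba_runs` is a hypothetical helper that only prints the commands (a dry run), so it is safe to try even after the cells above have been run. It assumes the same file naming as above (`../fastq/<sample>_1.fastq.gz` and `../fastq/<sample>_2.fastq.gz`).

```shell
# Hypothetical helper: print the `ariba run` command for each sample accession.
# This is a dry run; remove the leading `echo` to execute the commands instead.
print_ariba_runs() {
    for sample in "$@"; do
        echo ariba run out \
            "../fastq/${sample}_1.fastq.gz" \
            "../fastq/${sample}_2.fastq.gz" \
            "ariba_results_${sample}"
    done
}

print_ariba_runs ERR1067813 ERR1067814 ERR1067815
```

If you remove the `echo` to run ARIBA for real, note that it may refuse to write into an output directory that already exists, so you might need to delete the earlier `ariba_results_*` directories first.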

Next, summarise the results (the `--no_tree` option tells ARIBA not to generate a tree):

In [None]:
ariba summary --no_tree all_samples ariba_results*/report.tsv

Now inspect the results:

In [None]:
cat all_samples.csv

Notice that the original gene names are not present in the summary file. Instead, we have cluster names such as `cluster` and `cluster_1`. This is because ARIBA groups similar genes together into clusters.

To determine which genes are in which cluster, run:

In [None]:
ariba refquery out cluster cluster
ariba refquery out cluster cluster_1
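
With only three samples you can read the summary file by eye, but for larger sets it helps to count the matches per cluster. The sketch below is one possible approach using `awk`. The function name `count_yes_per_column` is hypothetical, and it assumes that the first column of the summary CSV holds the sample name and that every other column contains `yes` when the cluster was found in that sample. Check this against your own `all_samples.csv` before relying on it.

```shell
# Hypothetical helper: for each column after the first, count the rows whose
# value is exactly "yes". Assumes a comma-separated file with a header row.
count_yes_per_column() {
    awk -F ',' '
        NR == 1 {                      # header row: remember the column names
            for (i = 2; i <= NF; i++) name[i] = $i
            ncol = NF
            next
        }
        {                              # data rows: tally "yes" values
            for (i = 2; i <= NF; i++) if ($i == "yes") count[i]++
        }
        END {                          # print one "column<TAB>count" line each
            for (i = 2; i <= ncol; i++) printf "%s\t%d\n", name[i], count[i]
        }
    ' "$1"
}

# Example usage:
#   count_yes_per_column all_samples.csv
```

Combine the counts with the `ariba refquery` output above to map each cluster column back to a gene name.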

## Exercises
1. How many samples have the fitA gene?
2. How many samples have the fbpA gene?

Now go to the next part of the tutorial where we [use a standard database with ARIBA to determine AMR](standard_ariba.ipynb).