# Use a standard AMR database with ARIBA
This section shows you how to run ARIBA on several samples to detect the presence of AMR genes using standard public AMR databases.

Make a new directory `standard` and move into that directory:

In [None]:
mkdir ../standard
cd ../standard

## Reference database

We are going to download and prepare a reference database for use with ARIBA. You can use any one of the [standard reference databases that ARIBA supports](https://github.com/sanger-pathogens/ariba/wiki/Task:-getref):

* ARG-ANNOT. PMID: 24145532
* CARD. PMID: 23650175
* MEGARes. PMID: 27899569
* NCBI. BioProject: PRJNA313047
* plasmidfinder. PMID: 24777092
* resfinder. PMID: 22782487
* VFDB. PMID: 26578559
* SRST2's version of ARG-ANNOT. PMID: 25422674
* VirulenceFinder. PMID: 24574290

Let's use the CARD database. Run these commands to make an ARIBA database directory called `ariba_card_db`:

In [None]:
ariba getref card out.card
ariba prepareref -f out.card.fa -m out.card.tsv ariba_card_db

## How to run on one sample

To run ARIBA on one of the *Neisseria gonorrhoeae* samples against the CARD database, you need to supply the database directory we prepared earlier (`ariba_card_db`) and the two sequencing reads files, `../fastq/ERR1067813_1.fastq.gz` and `../fastq/ERR1067813_2.fastq.gz`.

The command to run ARIBA is:

In [None]:
ariba run ariba_card_db ../fastq/ERR1067813_1.fastq.gz ../fastq/ERR1067813_2.fastq.gz ariba.ERR1067813

This command creates a new directory called `ariba.ERR1067813` that contains the ARIBA results.

## Run on all samples

But what if we want to run ARIBA on all 3 samples? This can be done with a "for" loop. We assume that the reads files are named like this:

```
ERR1067813_1.fastq.gz ERR1067813_2.fastq.gz
ERR1067814_1.fastq.gz ERR1067814_2.fastq.gz
ERR1067815_1.fastq.gz ERR1067815_2.fastq.gz
```

Then we can run ARIBA on all samples like this (you may need to edit this command depending on how your own files are named):

In [None]:
for reads_1 in ../fastq/*_1.fastq.gz
do
    # Strip the directory and the _1.fastq.gz suffix to get the sample name
    sample=$(basename "$reads_1" _1.fastq.gz)
    ariba run ariba_card_db "../fastq/${sample}_1.fastq.gz" "../fastq/${sample}_2.fastq.gz" "ariba.${sample}"
done

The output directory of each sample is called `ariba.$sample`; for example, `ariba.ERR1067813` is the output directory for sample ERR1067813.
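
Before launching a long run, it can help to confirm that every sample has both reads files and to preview the commands that would be executed. The following is a minimal sketch under the file-naming assumption above; `list_paired_samples` is a hypothetical helper written for this tutorial, not part of ARIBA, and the loop only prints the `ariba run` commands rather than running them.

```shell
# Print the sample name for every <sample>_1.fastq.gz in a directory that
# also has a matching <sample>_2.fastq.gz; warn on stderr otherwise.
# list_paired_samples is a hypothetical helper, not an ARIBA command.
list_paired_samples() {
    for r1 in "$1"/*_1.fastq.gz; do
        [ -e "$r1" ] || continue    # the pattern matched no files
        sample=$(basename "$r1" _1.fastq.gz)
        if [ -e "$1/${sample}_2.fastq.gz" ]; then
            echo "$sample"
        else
            echo "warning: no _2 reads file for $sample" >&2
        fi
    done
}

# Dry run: show the commands without starting ARIBA.
for sample in $(list_paired_samples ../fastq); do
    echo ariba run ariba_card_db "../fastq/${sample}_1.fastq.gz" \
        "../fastq/${sample}_2.fastq.gz" "ariba.${sample}"
done
```

Removing the `echo` in front of `ariba run` would turn the preview into the real run.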

## ARIBA output

While you are waiting for ARIBA to run on all 3 samples, go to the [ARIBA wiki](https://github.com/sanger-pathogens/ariba/wiki/Task:-run) and read about the ARIBA output and what each of the columns in the `report.tsv` file means.
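
When reading about the columns, it may be handy to list the column names of a report alongside the wiki page. A minimal sketch, assuming only that the report is a tab-separated file whose first line is the header; `tsv_columns` is a hypothetical helper, not an ARIBA command.

```shell
# Print each column name from the header line of a tab-separated file,
# prefixed with its column number. Hypothetical helper, not part of ARIBA.
tsv_columns() {
    awk -F'\t' 'NR == 1 { for (i = 1; i <= NF; i++) print i ": " $i; exit }' "$1"
}
```

For example, `tsv_columns ariba.ERR1067813/report.tsv` would list the columns of the report from the single-sample run earlier in this section.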

## Summarising the results

Now gather the results from all samples into a single summary:

In [None]:
ariba summary all_results ariba.*/report.tsv
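
Because the shell pattern `ariba.*/report.tsv` only matches reports that exist, a sample whose run did not finish would be silently left out of the summary. A minimal sketch for spotting such samples, assuming the output directories are named `ariba.<sample>` as above; `missing_reports` is a hypothetical helper, not an ARIBA command.

```shell
# List ARIBA output directories (ariba.<sample>) under a parent directory
# that have no non-empty report.tsv. Hypothetical helper, not part of ARIBA.
missing_reports() {
    for d in "$1"/ariba.*/; do
        [ -d "$d" ] || continue                      # the pattern matched nothing
        [ -s "${d}report.tsv" ] || echo "${d%/}"     # report absent or empty
    done
}

# Check the current directory; no output means every run has a report.
missing_reports .
```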

Look at the files produced by `ariba summary`:

In [None]:
ls

The listing should include 3 new files:

`all_results.csv`  `all_results.phandango.csv`  `all_results.phandango.tre`

Now look at the file `all_results.csv` and answer the questions below.

## Exercises
1. Which AMR genes are present in all 3 samples?
2. Which AMR genes are absent in sample ERR1067813 but present in the other two samples?
3. Which AMR genes are absent in sample ERR1067814 but present in the other two samples?
4. Which AMR genes are absent in sample ERR1067815 but present in the other two samples?
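
Once you have answered the first question by eye, one way to double-check it is to ask which columns of the CSV are `yes` in every row. This is a sketch under assumptions: it expects a header row, the sample or file name in the first column, plain comma-separated fields without quoting, and presence encoded as the exact value `yes`. Check the header of your own `all_results.csv` before relying on it. `columns_all_yes` is a hypothetical helper, not an ARIBA command.

```shell
# Print the header names of the columns whose value is "yes" in every data
# row of a simple comma-separated file. Column 1 (the sample name) is skipped.
# Hypothetical helper, not part of ARIBA.
columns_all_yes() {
    awk -F, '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; ncol = NF; next }
        { for (i = 2; i <= ncol; i++) if ($i != "yes") bad[i] = 1 }
        END { for (i = 2; i <= ncol; i++) if (!(i in bad)) print name[i] }
    ' "$1"
}
```

Running `columns_all_yes all_results.csv` would then print one column name per line.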

Now go to the next part of the tutorial where we [prepare a custom reference database for ARIBA](make_custom_db.ipynb).