# <center>Exploring Antimicrobial Resistance (AMR) genes within wild and domestic animal populations</center>


## Publication

Skarżyńska M, Leekitcharoenphon P, Hendriksen RS, Aarestrup FM, Wasyl D (2020) A metagenomic glimpse into the gut of wild and domestic animals: Quantification of antimicrobial resistance and more. PLoS ONE 15(12):e0242987. https://doi.org/10.1371/journal.pone.0242987

## Abstract

Antimicrobial resistance (AMR) in bacteria is a complex subject, why one need to look at this phenomenon from a wider and holistic perspective. The extensive use of the same antimicrobial classes in human and veterinary medicine as well as horticulture is one of the main drivers for the AMR selection. Here, we applied shotgun metagenomics to investigate the AMR epidemiology in several animal species including farm animals, which are often exposed to antimicrobial treatment opposed to an unique set of wild animals that seems not to be subjected to antimicrobial pressure. The comparison of the domestic and wild animals allowed to investigate the possible anthropogenic impact on AMR spread. Inclusion of animals with different feeding behaviors (carnivores, omnivores) enabled to further assess which AMR genes that thrives within the food chain. We tested fecal samples not only of intensively produced chickens, turkeys, and pigs, but also of wild animals such as wild boars, red foxes, and rodents. A multi-directional approach mapping obtained sequences to several databases provided insight into the occurrence of the different AMR genes. The method applied enabled also analysis of other factors that may influence AMR of intestinal microbiome such as diet. Our findings confirmed higher levels of AMR in farm animals than in wildlife. The results also revealed the potential of wildlife in the AMR dissemination. Particularly in red foxes, we found evidence of several AMR genes conferring resistance to critically important antimicrobials like quinolones and cephalosporins. In contrast, the lowest abundance of AMR was observed in rodents originating from natural environment with presumed limited exposure to antimicrobials. Shotgun metagenomics enabled us to demonstrate that discrepancies between AMR profiles found in the intestinal microbiome of various animals probably resulted from the different antimicrobial exposure, habitats, and behavior of the tested animal species.


## Assignment Overview 

This exercise explores the diversity of antimicrobial resistance (AMR) genes in wild and domestic animal populations, using metagenomic samples from [Skarżyńska et al. (2020)](https://doi.org/10.1371/journal.pone.0242987) as a real-world dataset. The exercise has three parts; the first two replicate select analyses on the six metagenomes from the original publication using BV-BRC resources. First, the [Taxonomic Classification service](https://www.bv-brc.org/app/TaxonomicClassification) will be used for quality control of the raw read files and to review each species' gut microbial composition. Second, the [Metagenomic Read Mapping service](https://www.bv-brc.org/app/MetagenomicReadMapping) aligns reads to the Comprehensive Antibiotic Resistance Database (CARD), allowing you to observe the presence and abundance of AMR genes within each species' gut microbiome. Lastly, you will apply these newly developed skills to a publicly available metagenomic sample, from either a domestic or wild animal species of your choosing, compare it to the original dataset, and interpret your results in the context of diet, environment, and frequency of exposure to antimicrobials.

## <center>📝 Learning Objectives</center> 

* Utilizing BV-BRC bioinformatic resources to learn the basic concepts of a bioinformatic workflow.
  
* Analyzing the quality of metagenomic sequence data.
  
* Performing taxonomic classification of metagenomic sequence data and producing informative figures.
  
* Understanding how to use publicly available databases to classify AMR genes within a metagenome.
  
* Building confidence in your ability to use bioinformatic tools to address real-world questions.


## <center>Sequence Files</center> 

| Sample Name | Population Type | Read File Name | Read Type |
|-------------|---------------|---------------|-----------|
| Boars | Wild | Boars_classified_reads_R1.fastq.gz | Forward |
||| Boars_classified_reads_R2.fastq.gz | Reverse |
| Chickens | Domestic | Chickens_classified_reads_R1.clean_1.fastq.gz | Forward |
||| Chickens_classified_reads_R1.clean_2.fastq.gz | Reverse |
| Pigs | Domestic | Pigs_classified_reads_R1.fastq.gz | Forward |
||| Pigs_classified_reads_R2.fastq.gz | Reverse |
| Foxes | Wild | Foxes_classified_reads_R1.clean_1.fastq.gz | Forward |
||| Foxes_classified_reads_R1.clean_2.fastq.gz | Reverse |
| Turkeys | Domestic | Turkeys_classified_reads_R1.fastq.gz | Forward |
||| Turkeys_classified_reads_R2.fastq.gz | Reverse |
| Rodents | Wild | Rodents_classified_reads_R1.clean_1.fastq.gz | Forward |
||| Rodents_classified_reads_R1.clean_2.fastq.gz | Reverse |


## <center>Getting Started: Sign into the BV-BRC and access exercise material</center>

At this point in time, you should have already created an account on the [BV-BRC website](https://www.bv-brc.org/). If you haven't, please do so now!

You can access all the necessary material through this [workspace](https://www.bv-brc.org/workspace/jsheriff@bvbrc/BIOS%20450%20-%20Advanced%20Microbiology%20-%20Univ.%20of%20Illinois%20at%20Chicago/BIOS%20450%20-%20Advanced%20Microbiology%20-%20Univ.%20of%20Illinois%20at%20Chicago). You must be signed in **first** to access the workspace. Please let us know if any issues arise.


### Copying exercise material to home workspace

While you can access publicly-available workspaces, you should copy the "Exercise Material" folder over to your own workspace **first** before beginning your analyses. You can do this by clicking the folder and selecting "Copy" on the sidebar found on the right-hand side of your screen.

<center><div style="max-width:1000px">
    
![image.png](attachment:image.png)
    
</div></center>

**Next**, you will need to select which folder/workspace you would like as the destination for the copied folder. This destination could be your home workspace, or a workspace you created for this assignment.

<center><div style="max-width:1000px">
    
![image.png](attachment:image-4.png)
    
</div></center>

If you decide not to choose any folder shown on the Copy menu, the copied folder will automatically be placed in your home workspace.

<center><div style="max-width:1000px">
    
![image.png](attachment:image-3.png)
    
</div></center>

# <center>Part 1: Taxonomic Classification Service (TCS)</center>

The BV-BRC [Taxonomic Classification Service](https://www.bv-brc.org/app/TaxonomicClassification) is a useful tool for exploring the microbial composition of metagenomic samples. With it, we can compare the relative abundance of taxa accounting for at least 1% of the total read hits across the domestic and wild animal populations in the study. Additionally, the TCS returns quality control metrics for the raw reads of each sample and gives us insight into the structure of each microbial community based on alpha- and beta-diversity metrics. 

<center><div style="max-width:900px">
    
![image.png](attachment:27ff94b5-5a41-47d2-a87e-4b4fefb15cfd.png)

</div></center>
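
The 1% relative-abundance cutoff mentioned above can be illustrated with a minimal Python sketch. This is not how the TCS computes its results internally; the function name `relative_abundance` and the read counts are hypothetical and exist only for illustration.

```python
def relative_abundance(read_counts, min_percent=1.0):
    """Convert read counts per taxon to percent abundance, keeping taxa >= min_percent."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    percents = {taxon: 100.0 * n / total for taxon, n in read_counts.items()}
    # Sort from most to least abundant, then drop taxa below the cutoff.
    ranked = sorted(percents.items(), key=lambda item: item[1], reverse=True)
    return {taxon: pct for taxon, pct in ranked if pct >= min_percent}

# Hypothetical read counts, made up for this example (not taken from the study).
example_counts = {
    "Firmicutes": 6000,
    "Bacteroidetes": 3000,
    "Proteobacteria": 950,
    "Spirochaetes": 50,
}
abundant = relative_abundance(example_counts)
for taxon, pct in abundant.items():
    print(f"{taxon}: {pct:.1f}%")
```

In this made-up example, the taxon with 50 of 10,000 reads (0.5%) falls below the cutoff and is omitted from the comparison.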

#### Input File 

To use this service, we will use the paired-end FASTQ files provided within the **Sample_Data** directory of the exercise's workspace. To upload your data, select the folder next to the **READ FILE 1** dropdown menu and navigate to the **Classified_Sample_Reads** directory. For read file 1, you should select the R1 FASTQ file. Similarly, for read file 2, you should select the R2 FASTQ file. 

<center><div style="max-width:800px">
    
![image.png](attachment:7bf70977-626a-494c-8140-b63b56a5deae.png)

</div></center>

Once you have given your data a unique sample identifier name, click the **arrow** in the top-right corner to add your input file to the **Selected Libraries** list.

#### Parameters

<center><div style="max-width:800px">
    
![Screenshot 2025-12-03 at 3.13.18 PM.png](attachment:0d5e257b-8811-4d3f-bbd9-7cf98d95d1fe.png)

</div></center>

Adjusting the parameters prior to submitting the job allows us to control **how** the service analyzes our samples. Since we are working with metagenomic samples, **Whole Genome Sequencing** should be selected under **Sequencing Type**. We will perform a **Microbiome Analysis** and will use the **BV-BRC Database** as our reference database. Filtering host reads is optional, but it may be useful to filter out **Homo sapiens** reads from your dataset. The **Confidence Interval** should be set to **0.1**, and the **Output Folder** and **Output Name** should be selected appropriately. Hit submit once complete.

### Job Results

The status of your job submission can be viewed by either clicking the **Jobs** tab in the bottom-right corner of your screen or by clicking **My Jobs** under the **Workspaces** drop-down menu. 

<center><div style="max-width:1000px">
    
![image.png](attachment:feaff0fb-4236-445d-9426-1fbf825a6ae6.png)

</div></center>

#### Raw read quality scores

While the FASTQ utilities service is a tool developed specifically for the processing of raw sequencing data, the quality control metrics for each sample can also be viewed within their respective **folder**. After accessing the sample folder, you can view the QC reports by selecting the **fastqc_results** folder and choosing either the **host_removed_reads** or the **raw_reads** folder. All job results that can be viewed directly within your web browser will be saved under a **.html** extension. 

<center><div style="max-width:1000px">

![image.png](attachment:55eb1d19-92d1-433a-83a0-651d835c4f3c.png)

</div></center>

Before getting too deep into your analysis, it's important to first ensure the quality of your data is good enough to provide you with reliable and accurate insights into your microbiome sample. To do this, you should view the **per base sequence quality** graphic for your forward and reverse (e.g., R1 and R2) reads.  

This report gives you the average (the blue line), median (the red line), and overall distribution (the yellow box plot) of quality scores (the y-axis) for all of the reads within your files at each nucleotide position (the x-axis). Having a "good" quality score, or being within the "green zone" of values higher than 28, can be interpreted as a nucleotide position having a low probability of representing a sequencing error (a good thing!). It is normal for quality scores to drop slightly near the end of your reads, but as long as the **average** quality scores remain above 28, there is no need to make any adjustments to your raw reads before continuing your bioinformatic workflow. 
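
To make the threshold of 28 more concrete, the short sketch below converts Phred quality scores to error probabilities using the standard relationship Q = -10 * log10(P). The `mean_quality` helper assumes the common Phred+33 FASTQ encoding, which is an assumption about these files rather than something stated in the exercise material; both function names are illustrative.

```python
def phred_to_error_probability(q):
    """Probability that a base call with Phred score q is wrong (P = 10^(-q/10))."""
    return 10 ** (-q / 10)

def mean_quality(quality_string, offset=33):
    """Mean Phred score of one FASTQ quality line, assuming Phred+33 encoding."""
    scores = [ord(ch) - offset for ch in quality_string]
    return sum(scores) / len(scores)

for q in (20, 28, 30):
    p = phred_to_error_probability(q)
    print(f"Q{q}: about 1 error in {round(1 / p)} bases")

# 'I' encodes Q40 under Phred+33, so this made-up quality line averages 40.
print(mean_quality("IIII"))
```

A score of 28 therefore corresponds to roughly one expected error in every 630 base calls, which is why positions above this value are generally treated as reliable.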

## <center>&#128187; **Exercise 1: TCS Data Visualization**</center>

1) To begin, run the **Taxonomic Classification Service** on the six samples listed above, using their SRA accession numbers to upload the raw data files. The run parameters should be the same as shown above.
> Estimated Run Time: 5 hours 

2) After the job is completed, raw read quality scores for each sample are provided within their respective folders.
> a) Access the **Wild Fox** folder and find the **FastQC Reports** for the forward (R1) and reverse (R2) reads. Submit the **per base sequence quality** for both read files (two graphs in total). You can save these images by either right-clicking and selecting "Save Image As" or by taking a screenshot of your monitor. 

3) While there are a multitude of informative outputs provided by the TCS, the **Taxonomic-Classification-Service-BVBRC_multiqc_report.html** file contains interactive plots and diagrams used to illustrate the taxonomic composition, quality, and general statistics of your metagenomic samples.
> a) Using the results from the **Bracken** computational tool, create a **stacked bar graph** that illustrates the **percent abundance** (or the relative abundance) of the top 5 **phyla** for each sample (six graphs in total!). You can save each graph by using the **Export Plot** button to download your graph as a PNG file.
> 
> b) For **each sample**, identify the most abundant **bacterial** phylum and include its **percent abundance**.
>> **NOTE**: *Chordata* is not a bacterial phylum!

## <center>Part 2: Metagenomic Read Mapping Service (MRMS)</center>

The MRMS is a valuable resource for researchers interested in identifying antimicrobial resistance genes present within metagenomic samples. To do this, the MRMS uses [k-mer alignment](https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2336-6) to align sequence data to reference genes within the [Comprehensive Antibiotic Resistance Database (CARD)](https://pubmed.ncbi.nlm.nih.gov/31665441/). Not only can this service determine the number of different AMR genes present within a metagenomic sample, but it can also provide insights into the abundance of each gene based on the mapped sequencing **depth** (i.e., how many times, on average, each position of a specific AMR reference gene is covered by aligned reads).

#### Input File 

<center><div style="max-width:1000px">

![Screenshot 2025-12-03 at 3.31.15 PM.png](attachment:c4a55aa2-54e7-4852-92dd-6d098135322a.png)

</div></center>

As with the Taxonomic Classification Service, we will use the paired-end FASTQ files provided within the **Sample_Data** directory of the exercise's workspace. To upload your data, select the folder next to the **READ FILE 1** dropdown menu and navigate to the **Classified_Sample_Reads** directory. For read file 1, you should select the R1 FASTQ file. Similarly, for read file 2, you should select the R2 FASTQ file. 

However, an MRMS job submission should consist of only **one** sample at a time, because the service pools the reads from every library listed under **Selected Libraries** and analyzes them together.

#### Parameters

<center><div style="max-width:1000px">

![image.png](attachment:0dc9c6c7-3b41-4652-812c-8822e9471485.png)

</div></center>

For the parameters, **Predefined List** should be indicated under **Gene Set Type**, and **CARD** should be chosen as the **Predefined Gene Set Name**. As in the previous analysis, **Output Folder** and **Output Name** should be selected appropriately. Hit submit once complete.

### Job Result

MRMS outputs can be found within their respective output folders. To view the list of AMR genes identified within your sample, select the **kma.res** file.

<center><div style="max-width:1000px">

![image.png](attachment:6b23cff1-fb05-4978-aacb-995c527b612b.png)

</div></center>

However, we are unable to view this output in its current format. To make the file viewable within your web browser, select the file and change the file **type** to **tsv** from the dropdown list. Afterwards, click **Save** and select the **View** icon to open the document.

<center><div style="max-width:900px">

![image.png](attachment:99f670f9-fbf6-4ff8-a902-e074f5691795.png)

</div></center>

Above your data sheet, click on **First Row Contains Column Headers**. You will now be able to view your MRMS results, and you can reorder the data sheet by selecting the **column header** you would like to reorder by. 
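
If you prefer to work with a downloaded copy of the results instead of the browser viewer, the same reordering can be sketched in Python. This is a minimal example, assuming the file is tab-separated, that its header line begins with `#Template`, and that it contains a `Depth` column; the gene names are real CARD entries, but the rows and values are made up for illustration and show only a subset of the columns.

```python
import csv
import io

def load_kma_res(handle):
    """Read a tab-separated KMA .res table into a list of dicts with a numeric Depth."""
    reader = csv.DictReader(handle, delimiter="\t")
    rows = []
    for row in reader:
        # Strip padding, and the leading '#' this sketch assumes on the header line.
        clean = {key.strip().lstrip("#"): value.strip() for key, value in row.items()}
        clean["Depth"] = float(clean["Depth"])
        rows.append(clean)
    return rows

def rank_by_depth(rows):
    """Return rows ordered from highest to lowest Depth."""
    return sorted(rows, key=lambda row: row["Depth"], reverse=True)

# Made-up rows for illustration; a real file has more columns and many more genes.
example_res = (
    "#Template\tTemplate_length\tDepth\n"
    "ermF\t801\t12.40\n"
    "tet(Q)\t1974\t35.75\n"
    "aph(3')-IIIa\t795\t3.10\n"
)
ranked = rank_by_depth(load_kma_res(io.StringIO(example_res)))
for row in ranked:
    print(row["Template"], row["Depth"])
```

To use it on a real file, replace the `io.StringIO(example_res)` argument with an open file handle, for example `open("kma.res", newline="")`.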


## <center>&#128187; **Exercise 2: Comparing the AMR gene population between wild and domestic animals**</center>

1) To begin, run the **Metagenomic Read Mapping Service** on the six samples listed above, using their SRA accession numbers to upload the raw data files. The run parameters should be the same as shown above.
> Estimated Run Time: 30 minutes - 1 hour

<center><div style="max-width:1000px">

![image.png](attachment:6c0560e6-3b57-456b-aa1e-5d27466f0661.png)

</div></center>

2) Now that we have completed our analysis, we can begin to address our initial question: How does the AMR gene population within the gut microbiome of domestic animals differ from what's found within wild animal populations? The figure above illustrates the AMR gene richness (A-B), bacterial richness (C-D), and the correlation between AMR richness and bacterial richness (E) of our wild and domestic animal populations. 
> a) Briefly, describe the difference between species richness and species diversity.
> 
> b) Which population type (domestic or wild) has the higher AMR gene richness? 
> 
> c) Which population type has a higher bacterial richness?
> 
> d) Based on Figure E, we can see a positive correlation between AMR gene richness and bacterial richness. If we were to view these gut microbiomes as a community of organisms **competing** with one another for resources, in 2-3 sentences, explain how a higher AMR gene richness may help increase the richness of a microbiome.
>
> e) Conversely, how may a higher bacterial richness increase the number of AMR genes present within a microbiome?
>
>> **NOTE**: Feel free to use additional resources, or the original paper, to help you answer these questions. Be sure to include a citation for any resource you use.


3) For the final part of this exercise, you will need to access the job outputs from your six MRMS runs. Find the **kma.res** file for each of your samples, and follow the instructions described above to view the document within your web browser. Be sure to reorder each list by the **Depth** column, so that the most abundant AMR gene is listed at the top.

> a) For each animal, identify its most abundant AMR gene. For each AMR gene, describe which antibiotic it confers resistance to, and provide a brief description of the mechanism of the AMR gene.
>
> b) Based on Figure A, we see that chickens and foxes have the greatest AMR gene richness for their respective population type. Find the top AMR gene for both animals, and provide possible reasons for **why** resistance to those antibiotics may have arisen within each of these populations. To do so, you should take into consideration (1) the antibiotic's use within agriculture and (2) possible food resources for each of the two animals. Feel free to use the original publication as a resource, or you may reference additional outside sources. 

## <center>&#128187; **Exercise 3: Putting your knowledge to use: analyzing the AMR gene population of your own metagenome!** </center>

While we were able to replicate the findings of Skarżyńska et al., one question still remains: are these findings unique to this study, or does a **broader** trend exist between animal population type and the AMR gene population of the host's microbiome?

To address this, you will need to find a publicly available metagenomic sample from the [NCBI database](https://www.ncbi.nlm.nih.gov/). To conduct your search efficiently, select one of the species from the curated list of animal gut metagenomes used in a study by [Youngblut et al. (2020)](https://www.bv-brc.org/workspace/jsheriff@bvbrc/BIOS450_AMR_Exercise/Exercise%20Material/Youngblutetal2020_SpeciesList_Reduced.csv). Choose one sample from the list (for population types: domestic = captive, wild = wild), and search its **sample name** (listed in the first column) in the [NCBI SRA database](https://www.ncbi.nlm.nih.gov/sra). Select either option from your search (the search should only return two options with the same name), and the **SRA Accession Number** will be found at the bottom of the page, under the **Runs** subheader. 

1) Using the SRR Accession Number, you will analyze the taxonomic composition of your host's microbiome with the Taxonomic Classification Service. You will repeat the same requirements described in **Exercise 1**.
> Once your job run is complete, begin by viewing the **FastQC Report**. Ensure that the average quality of your reads does **not** drop below 28. If it does, you will need to select a different sample to avoid complications with the interpretation of your results. 

2) Next, you will analyze the AMR gene population of your metagenome using the Metagenomic Read Mapping Service. With your results, you will need to:
> a) Determine the number of AMR genes present within your sample. Once you have made your **kma.res** file viewable and have selected **First Row Contains Column Headers**, the number of AMR genes present is equivalent to the number of rows within your data sheet. Does this value align with the AMR gene richness of either the wild or domestic animal populations of the original study?
> 
> b) Which AMR gene is the **most abundant** within your metagenome? Similar to **Exercise 2**, you should identify which antibiotic this gene confers resistance to, and briefly describe the mechanism.
>
> c) Lastly, conduct some additional research and write 2-3 sentences on **why** this AMR gene is abundant within your particular organism. While conducting your research, it may be helpful to consider (1) the population type of your organism, (2) if your domestic animal is a household pet or livestock, and (3) the country of origin, since different countries may have differences in the legislation regarding the use of antibiotics.
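
The row count described in step 2a can also be checked programmatically. The sketch below counts distinct reference genes in the text of a **kma.res** file, assuming a tab-separated layout with the gene name in the first column and a header line starting with `#`; the sample names and gene rows are hypothetical placeholders, not results from any real metagenome.

```python
def amr_gene_richness(res_text):
    """Count distinct reference genes (first column) in the text of a KMA .res file."""
    genes = set()
    for line in res_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the header row
        genes.add(line.split("\t")[0].strip())
    return len(genes)

# Hypothetical file contents, made up for illustration only.
samples = {
    "hypothetical_wild": "#Template\tDepth\ngeneA\t4.2\ngeneB\t1.1\n",
    "hypothetical_domestic": (
        "#Template\tDepth\ngeneA\t9.8\ngeneB\t2.0\ngeneC\t7.5\ngeneD\t0.6\n"
    ),
}
richness = {name: amr_gene_richness(text) for name, text in samples.items()}
for name, count in richness.items():
    print(f"{name}: {count} AMR genes")
```

Counting distinct names rather than raw lines guards against a gene being listed twice, although each reference gene is expected to appear only once per file.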


## Resources

Alcock, B.P., et al., CARD 2020: antibiotic resistome surveillance with the comprehensive antibiotic resistance database. Nucleic acids research, 2020. 48(D1): p. D517-D525.

Andrews S. (2010). FastQC: a quality control tool for high throughput sequence data. Available online at: http://www.bioinformatics.babraham.ac.uk/projects/fastqc

Clausen, P.T., F.M. Aarestrup, and O. Lund, Rapid and precise alignment of raw reads against redundant databases with KMA. BMC bioinformatics, 2018. 19(1): p. 307.

Skarżyńska M, Leekitcharoenphon P, Hendriksen RS, Aarestrup FM, Wasyl D (2020) A metagenomic glimpse into the gut of wild and domestic animals: Quantification of antimicrobial resistance and more. PLoS ONE 15(12):e0242987. https://doi.org/10.1371/journal.pone.0242987

Wattam, A. R., Bowers, N., Brettin, T., Conrad, N., Cucinell, C., Davis, J. J., Dickerman, A. W., Dietrich, E. M., Kenyon, R. W., Machi, D., Mao, C., Nguyen, M., Olson, R. D., Overbeek, R., Parrello, B., Pusch, G. D., Shukla, M., Stevens, R. L., Vonstein, V., & Warren, A. S. (2024). Comparative Genomic Analysis of Bacterial Data in BV-BRC: An Example Exploring Antimicrobial Resistance. In J. C. Setubal, P. F. Stadler, & J. Stoye (Eds.), Comparative Genomics (Vol. 2802, pp. 547–571). Springer US. https://doi.org/10.1007/978-1-0716-3838-5_18

Youngblut, N. D., de la Cuesta-Zuluaga, J., Reischer, G. H., Dauser, S., Schuster, N., Walzer, C., Stalder, G., Farnleitner, A. H., & Ley, R. E. (2020). Large-Scale Metagenome Assembly Reveals Novel Animal-Associated Microbial Genomes, Biosynthetic Gene Clusters, and Other Genetic Diversity. mSystems, 5(6), 10.1128/msystems.01045-20. https://doi.org/10.1128/msystems.01045-20