# Preliminary Exploration of the Environmental Virome of Cuatro Ciénegas Using Kraken2 and Krona


<img src="https://lh3.googleusercontent.com/gps-cs-s/AC9h4nrK77DASdlncAKEZj-u8mLB1tMT6QkQJ3v17eU9LKDq51EBwErvkKHYeIdV0B0a-3xmkCE3YJlXcSLlpB830MGIgaxu3qVSQ4Wlk3qvPMcz8wW3pBXyRk5c7l00wrfG05E53dU=w675-h390-n-k-no" 
     alt="Cuatro Ciénegas Desert" 
     style="width:100%; height:auto; border-radius:8px;">

<p style="text-align:center;"><em>Figure: Landscape of Cuatro Ciénegas, Mexico (source: Google)</em></p>

---

## 1. Context

The Cuatro Ciénegas desert, located in northern Mexico, is an ancient oasis characterized by shallow pools hosting remarkable microbial biodiversity, often regarded as a model for primitive ecosystems. As part of a 32-day ecological experiment, two ponds were sampled:

- JC1A: Control pond (unfertilized)

- JP4D: Fertilized pond

These contrasting environments served as the basis for metagenomic analyses aimed at exploring the virome present in each habitat, using tools accessible on limited-resource machines.



---

## 2. Objectives

This initial report aims to:

- Detect viral reads in environmental samples

- Visualize taxonomic profiles using Krona

- Assess the feasibility of conducting such analyses on modest computational resources (8 GB RAM)

---

## 3. Methodology
- Data type: Paired-end Illumina sequencing reads (gzipped FASTQ)
- Database used: kraken2_viral_202504.tgz, a reduced viral database optimized for memory constraints

- The analyses were conducted on paired-end metagenomic datasets originating from a public ecological experiment in Cuatro Ciénegas (Mexico), available via Zenodo. Bioinformatics processing was carried out within a Dockerized environment to ensure reproducibility and resource optimization, particularly in the context of limited RAM capacity (8 GB).

- The pipeline included an initial quality control of the raw sequences, followed by taxonomic classification focused on viral content using Kraken2 with a lightweight viral database adapted to hardware constraints. Results were then visualized interactively using Krona, allowing hierarchical exploration of the detected taxonomic profiles.

- This methodology combines efficiency, scientific rigor, and technical pragmatism, qualities that are essential for conducting pathogen bioinformatics research in low-resource environments such as those often encountered across the African continent.
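The three steps described above (quality control, Kraken2 classification, Krona rendering) could be assembled roughly as follows. This is a minimal sketch, not the exact commands used in this notebook: the helper name `build_pipeline_commands`, the `_R1`/`_R2` file naming scheme, the directory layout, and the thread count are all assumptions made for illustration.

```python
def build_pipeline_commands(sample, reads_dir, db_dir, out_dir, threads=2):
    """Return FastQC, Kraken2 and Krona command lines for one paired-end sample."""
    # Hypothetical naming scheme: <sample>_R1.fastq.gz and <sample>_R2.fastq.gz
    r1 = f"{reads_dir}/{sample}_R1.fastq.gz"
    r2 = f"{reads_dir}/{sample}_R2.fastq.gz"
    kraken_out = f"{out_dir}/{sample}.kraken"
    report = f"{out_dir}/{sample}.report"
    krona_html = f"{out_dir}/{sample}_krona.html"

    # Step 1: quality control of the raw reads
    fastqc = ["fastqc", "--outdir", out_dir, r1, r2]

    # Step 2: taxonomic classification against the viral database
    kraken2 = [
        "kraken2", "--db", db_dir, "--threads", str(threads),
        "--paired", "--gzip-compressed",
        "--report", report, "--output", kraken_out,
        r1, r2,
    ]

    # Step 3: Krona chart from the per-read output
    # (column 2 holds the read ID, column 3 the assigned taxonomy ID)
    krona = ["ktImportTaxonomy", "-q", "2", "-t", "3", kraken_out, "-o", krona_html]

    return [fastqc, kraken2, krona]


for cmd in build_pipeline_commands("JC1A", "data", "db/viral", "results"):
    print(" ".join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` inside the container. Building the commands separately from running them makes it easy to inspect them before committing limited RAM to a run.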

---

## 4. Krona Visualization

```python
from IPython.display import display_html

# Embed both Krona charts side by side (paths are relative to the notebook)
iframe1 = '<iframe src="returned_files/krona/JC1A_krona.html" width="400" height="400"></iframe>'
iframe2 = '<iframe src="returned_files/krona/JP4D_krona.html" width="400" height="400"></iframe>'

display_html(iframe1 + iframe2, raw=True)
```

---

## 4. Results
The following table summarizes the proportions of viral and unclassified reads (“Other Root”) derived from Krona visualizations for the two samples:

| Sample | Viral Proportion | “Other Root” (unclassifiable reads) |
| ------ | ---------------- | ----------------------------------- |
| JC1A   | 0.6%             | 0.0007%                             |
| JP4D   | 0.3%             | 0.005%                              |


Krona visualizations indicate that both environments contain viral signatures, albeit at low abundance.
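Proportions read off a Krona chart can be cross-checked against the Kraken2 report file, if one was saved with `--report`. The sketch below assumes the standard six-column, tab-separated report layout (percentage, clade reads, direct reads, rank code, NCBI taxonomy ID, name). The function name `summarize_kraken_report` is hypothetical, and the three example lines are made-up values used only to exercise the parser, not results from JC1A or JP4D.

```python
def summarize_kraken_report(report_text, viral_taxid="10239"):
    """Return the percentage of unclassified, root and viral reads in a Kraken2 report."""
    summary = {"unclassified": 0.0, "root": 0.0, "viruses": 0.0}
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        percent = float(fields[0])   # percentage of reads in this clade
        taxid = fields[4].strip()    # NCBI taxonomy ID
        if taxid == "0":
            summary["unclassified"] = percent
        elif taxid == "1":
            summary["root"] = percent
        elif taxid == viral_taxid:   # 10239 is the NCBI taxonomy ID for Viruses
            summary["viruses"] = percent
    return summary


# Synthetic report lines for illustration only
example_report = "\n".join([
    " 99.40\t99400\t99400\tU\t0\tunclassified",
    "  0.60\t600\t5\tR\t1\troot",
    "  0.59\t595\t10\tD\t10239\t  Viruses",
])
print(summarize_kraken_report(example_report))
```

Reading the report directly avoids rounding introduced by the chart labels and gives the read counts needed for any later statistical comparison.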

## 5. Interpretation

- The control sample JC1A shows the higher proportion of viral reads (0.6% versus 0.3% in JP4D), which may reflect a more stable microbial community in which known viruses are easier to detect.

- The fertilized sample JP4D shows a higher proportion of unclassified reads (0.005% versus 0.0007% in JC1A), although both values are very small. This could indicate shifts in the microbial community or the presence of viral taxa not represented in the current database.

- These observations suggest that fertilization may influence viral community composition or disrupt host communities, indirectly affecting the detectability of known viruses.
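To make the size of these differences explicit, the relative change between samples can be computed from the proportions in the results table. This is plain arithmetic on the reported percentages. With one sample per condition and no read counts reported here, it describes the contrast but does not test whether it is statistically significant.

```python
# Percentages taken from the results table of this report
proportions = {
    "JC1A": {"viral": 0.6, "other_root": 0.0007},
    "JP4D": {"viral": 0.3, "other_root": 0.005},
}

# How many times more viral reads (proportionally) in the control pond
viral_ratio = proportions["JC1A"]["viral"] / proportions["JP4D"]["viral"]

# How many times more "Other Root" reads (proportionally) in the fertilized pond
other_root_ratio = proportions["JP4D"]["other_root"] / proportions["JC1A"]["other_root"]

print(f"Viral proportion, JC1A / JP4D: {viral_ratio:.1f}x")
print(f"Other Root proportion, JP4D / JC1A: {other_root_ratio:.1f}x")
```

The viral fraction differs by a factor of about 2 and the "Other Root" fraction by a factor of about 7, but the latter is a ratio of two very small percentages and should be read with caution.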



---

## 6. Limitations and Technical Considerations
| Constraints | Adaptations Implemented |
| ----------- | ----------------------- |
| Limited RAM (8 GB) | Use of a reduced viral database (kraken2_viral_202504.tgz) |
| Incomplete taxonomy coverage | Acceptance of a small fraction of unclassified reads |
| No direct functional annotation | Focus on initial taxonomic exploration |
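Before launching Kraken2 on an 8 GB machine, it can help to check whether the extracted database is likely to fit in memory, since Kraken2 loads the database into RAM unless `--memory-mapping` is used. The sketch below sums the sizes of the `.k2d` files in a database directory and compares the total to a RAM budget. The function name `kraken_db_fits_in_ram` and the 2 GB headroom reserved for the operating system and other processes are assumptions, not values taken from the Kraken2 documentation.

```python
from pathlib import Path


def kraken_db_fits_in_ram(db_dir, ram_gb=8.0, headroom_gb=2.0):
    """Return (database size in GiB, True if it fits within ram_gb minus headroom_gb)."""
    # A built Kraken2 database stores its index in files ending in .k2d
    size_bytes = sum(p.stat().st_size for p in Path(db_dir).glob("*.k2d"))
    size_gb = size_bytes / 1024**3
    return size_gb, size_gb <= ram_gb - headroom_gb
```

For example, `kraken_db_fits_in_ram("db/viral")` would return the size of a database extracted to the hypothetical directory `db/viral` and whether it stays under 6 GB. If it does not fit, a smaller database or `--memory-mapping` (slower, but lighter on RAM) are the usual fallbacks.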



---

## 7. Perspectives

This introductory report lays the groundwork for:

- A comparative analysis between JC1A and JP4D (Report 2)

- Functional annotation of viral taxa (Report 3)

- Targeted extraction and functional analysis of viral reads (Report 4)

- Methodological discussion on reproducibility under computational constraints (Report 5)

---

## 8. Conclusion

This preliminary exploration of viral diversity across two contrasting microecosystems in Cuatro Ciénegas demonstrates that meaningful biological insights can be obtained using a lightweight, resource-conscious bioinformatics pipeline. Although modest, the differences observed between the fertilized (JP4D) and control (JC1A) samples suggest that environmental disturbances can influence both the detectable virome composition and the proportion of unclassifiable reads.

These findings align with prior research in Cuatro Ciénegas, which has shown that nutrient enrichment alters microbial community structures (Peimbert et al., 2012; Lee et al., 2017), and that the region hosts an exceptionally rich and undercharacterized virome (Desnues et al., 2008). Despite relying on a reduced viral database, this analysis succeeded in capturing ecologically relevant signals consistent with the known dynamics of this unique environment.

This work illustrates my ability to apply bioinformatics tools to ecologically and epidemiologically significant questions, combining technical adaptability, scientific rigor, and critical interpretation. It aligns directly with the mission of the African STARS MSc program, which aims to empower the next generation of scientists to tackle infectious disease challenges through genomic research, particularly in resource-constrained settings. Through this training, I aspire to deepen my expertise and contribute meaningfully to pathogen surveillance and genomic research across Africa.

---

## References
[1] Lee, Z. M. P., Steger, C. E., Corman, J. R., Neveu, M., Poret-Peterson, A. T., Souza, V., & Shade, A. (2017). Nutrient stoichiometry shapes microbial diversity in an oligotrophic desert oasis. Frontiers in Microbiology, 8, 1425. https://doi.org/10.3389/fmicb.2017.01425

[2] Peimbert, M., Alcaraz, L. D., Bonilla-Rosso, G., Olmedo-Álvarez, G., García-Oliva, F., Segovia, L., ... & Eguiarte, L. E. (2012). Comparative metagenomics of two microbial mats at Cuatro Ciénegas Basin II: community structure and composition in oligotrophic environments. Astrobiology, 12(7), 659–673. https://doi.org/10.1089/ast.2011.0690

[3] Desnues, C., Rodriguez-Brito, B., Rayhawk, S., Kelley, S., Tran, T., Haynes, M., ... & Rohwer, F. (2008). Biodiversity and biogeography of phages in modern stromatolites and thrombolites. Nature, 452(7185), 340–343. https://doi.org/10.1038/nature06735

[4] Wood, D. E., Lu, J., & Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome Biology, 20, 257. https://doi.org/10.1186/s13059-019-1891-0

[5] Ondov, B. D., Bergman, N. H., & Phillippy, A. M. (2011). Interactive metagenomic visualization in a web browser. BMC Bioinformatics, 12, 385. https://doi.org/10.1186/1471-2105-12-385

[6] Zenodo. (2023). Cuatro Ciénegas metagenomic sequencing data [Data set]. https://zenodo.org/record/7871630

[7] Andrews, S. (2010). FastQC: A Quality Control Tool for High Throughput Sequence Data. https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

[8] Johns Hopkins University, Center for Computational Biology. (2019). Kraken2: Metagenomic classification. https://ccb.jhu.edu/software/kraken2/

[9] Ondov, B. (2016). Krona Tools for Metagenomic Visualization. https://github.com/marbl/Krona/wiki

---

<hr style="margin-top: 50px;">

<footer style="text-align: center; font-size: small; color: gray;">
  © 2025 Mbock Mbock Georges Christian – All rights reserved.  
  <br>Notebook produced as part of an application for the MSc in Bioinformatics of Infectious Diseases and Pathogen Genomics – Stellenbosch University (African STARS Program).
</footer>


---