This notebook processes the dataset extracted from the Ibadan water study to perform hierarchical clustering of isolates based on resistance profiles.

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load data from local CSV file (data from study)
data = pd.read_csv('ibadan_ecoli_data.csv')

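# Reshape to one row per isolate and one column per season, with MDR status as the cell value.
# pivot() takes keyword-only arguments in pandas >= 2.0 and raises if an isolate/season pair repeats.
pivot_table = data.pivot(index='isolate_id', columns='season', values='MDR')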

# Plot heatmap
plt.figure(figsize=(8,6))
sns.heatmap(pivot_table, annot=True, cmap='viridis')
plt.title('MDR Profiles of E. coli Isolates by Season')
plt.show()

This heatmap shows the MDR status of each isolate in each season. It does not cluster the isolates, but it gives a first visual check of whether MDR isolates are concentrated in particular seasons.
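To put a number on the seasonal pattern, the MDR prevalence per season can be tabulated directly. The following is a minimal sketch, assuming `MDR` is coded as 0/1; the `demo` frame is an invented stand-in for `ibadan_ecoli_data.csv` and its values are not study data.

```python
import pandas as pd

# Toy stand-in for the study CSV; values are invented for illustration only.
demo = pd.DataFrame({
    "isolate_id": ["E1", "E2", "E3", "E4", "E5", "E6"],
    "season": ["wet", "wet", "wet", "dry", "dry", "dry"],
    "MDR": [1, 1, 0, 1, 0, 0],  # assumption: 1 = MDR, 0 = non-MDR
})

# Share of MDR isolates within each season (mean of a 0/1 column).
mdr_prevalence = demo.groupby("season")["MDR"].mean()

# Counts of MDR vs non-MDR isolates per season.
mdr_counts = pd.crosstab(demo["season"], demo["MDR"])

print(mdr_prevalence)
print(mdr_counts)
```

On the real data, replacing `demo` with `data` should give the same two tables, provided the `season` and `MDR` columns are coded as assumed here.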

In [None]:
# Additional step: hierarchical clustering
from scipy.cluster.hierarchy import linkage, dendrogram

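# Assumes the dataset has one numeric column per resistance gene (e.g. presence/absence as 1/0)
features = data[['resistance_gene1', 'resistance_gene2', 'resistance_gene3']]
# Ward linkage merges, at each step, the pair of clusters that least increases within-cluster variance
linked = linkage(features.to_numpy(), method='ward')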

dendrogram(linked, labels=data['isolate_id'].values, orientation='top')
plt.title('Hierarchical Clustering of Isolates')
plt.xlabel('Isolate ID')
plt.ylabel('Distance')
plt.show()

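The dendrogram groups isolates with similar resistance gene profiles. Tight clusters may point to clonally related isolates, and comparing cluster membership with sampling season can show whether such groups are seasonal. The dendrogram alone does not establish either.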
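One way to relate the tree to season is to cut it into flat clusters and cross-tabulate cluster membership against season. This is a sketch under stated assumptions: the `demo` frame is invented toy data, and cutting into two clusters (`n_clusters = 2`) is an arbitrary choice for illustration, not a value taken from the study.

```python
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster

# Toy stand-in for the study CSV; values are invented for illustration only.
demo = pd.DataFrame({
    "isolate_id": ["E1", "E2", "E3", "E4"],
    "season": ["wet", "wet", "dry", "dry"],
    "resistance_gene1": [1, 1, 0, 0],
    "resistance_gene2": [1, 1, 0, 0],
    "resistance_gene3": [0, 1, 0, 1],
})
gene_cols = ["resistance_gene1", "resistance_gene2", "resistance_gene3"]

# Same Ward linkage as in the cell above, applied to the toy features.
linked_demo = linkage(demo[gene_cols].to_numpy(dtype=float), method="ward")

# Cut the tree so that at most n_clusters flat clusters remain.
n_clusters = 2  # assumption: chosen only for this example
demo["cluster"] = fcluster(linked_demo, t=n_clusters, criterion="maxclust")

# Rows are clusters, columns are seasons, cells are isolate counts.
cluster_by_season = pd.crosstab(demo["cluster"], demo["season"])
print(cluster_by_season)
```

A cluster whose isolates fall almost entirely in one season would be a candidate for closer inspection. The cluster numbers themselves are arbitrary labels.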

In [None]:
# Final analysis: relating plasmid replicon presence to MDR status

# Count each replicon type separately for MDR and non-MDR isolates
grouped = data.groupby('MDR')['plasmid_replicon'].value_counts()
print(grouped)
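Raw counts do not show whether replicon type and MDR status are associated. A contingency table with a chi-square test of independence is one common check. The sketch below uses invented toy data, and the replicon names are placeholders rather than types reported for these isolates. With small expected counts the chi-square approximation is unreliable, so an exact test may be preferable.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Toy stand-in for the study CSV; values are invented for illustration only.
demo = pd.DataFrame({
    "MDR": [1, 1, 1, 1, 0, 0, 0, 0],  # assumption: 1 = MDR, 0 = non-MDR
    "plasmid_replicon": ["IncF", "IncF", "IncF", "IncI1",
                         "IncI1", "IncI1", "IncI1", "IncF"],
})

# Rows are MDR status, columns are replicon types, cells are isolate counts.
table = pd.crosstab(demo["MDR"], demo["plasmid_replicon"])

# Row-wise proportions: replicon frequency within MDR and within non-MDR isolates.
proportions = pd.crosstab(demo["MDR"], demo["plasmid_replicon"], normalize="index")

# Chi-square test of independence between MDR status and replicon type.
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(proportions)
print(f"chi2={chi2:.3f}, dof={dof}, p={p_value:.3f}")
```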




