## RNA-Seq Differential Expression Analysis

This notebook analyzes the transcriptomic changes in the lungs of K18-hACE2 mice treated with fusion-inhibitory lipopeptides compared to untreated controls.

In [None]:
# Import necessary libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import plotly.express as px

### Load Differential Expression Data

Load the RNA-Seq count data and the sample annotations, then normalize the counts to log2 counts per million (CPM). Genes are expected as rows and samples as columns.

In [None]:
# Load data
counts = pd.read_csv('GSE223056_counts.csv', index_col=0)
conditions = pd.read_csv('GSE223056_conditions.csv')

# Normalize data (simple CPM): divide each sample (column) by its library size
counts_cpm = counts.div(counts.sum(axis=0), axis=1) * 1e6

# Log2 Transformation
log2_cpm = np.log2(counts_cpm + 1)
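
To make the normalization concrete, here is a minimal sketch on a made-up count matrix. The gene names, sample names, and values are hypothetical, and it assumes genes are rows and samples are columns. Each sample's counts are divided by that sample's library size, so every column of the CPM table sums to one million.

```python
import numpy as np
import pandas as pd

# Toy count matrix (hypothetical values): rows are genes, columns are samples
toy_counts = pd.DataFrame(
    {"sample_A": [10, 30, 60], "sample_B": [200, 300, 500]},
    index=["gene1", "gene2", "gene3"],
)

# Library size = total counts per sample (column sums)
library_sizes = toy_counts.sum(axis=0)

# Divide each column by its library size and scale to counts per million
toy_cpm = toy_counts.div(library_sizes, axis=1) * 1e6

# Log2 transform with a pseudocount of 1 so that zero counts stay finite
toy_log2_cpm = np.log2(toy_cpm + 1)

print(toy_cpm)
```

Although `sample_B` was sequenced ten times deeper than `sample_A`, the CPM values of the two samples are on the same scale after this step.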

### Identify Differentially Expressed Genes

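A per-gene two-sample t-test on the log2 CPM values identifies genes with significant expression changes. The p-values are adjusted for multiple testing with the Benjamini-Hochberg procedure, and genes are kept if the adjusted p-value is below 0.05 and the absolute log2 fold change exceeds 1.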

In [None]:
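# Define groups
treated = conditions[conditions['Group'] == 'Peptide']
untreated = conditions[conditions['Group'] == 'Control']

# Map each group to its sample columns (assumes the rows of `conditions`
# are in the same order as the sample columns of `counts`)
treated_cols = log2_cpm.columns[treated.index]
untreated_cols = log2_cpm.columns[untreated.index]

# Perform a per-gene t-test (axis=1 runs the test across samples for each gene)
_, p = stats.ttest_ind(log2_cpm[treated_cols], log2_cpm[untreated_cols], axis=1)
# Genes with constant expression yield NaN p-values; drop them before adjustment
p_values = pd.Series(p, index=log2_cpm.index).dropna()

# Adjust p-values (Benjamini-Hochberg FDR)
from statsmodels.stats.multitest import multipletests
adjusted_p = multipletests(p_values, method='fdr_bh')[1]

# Create results dataframe
results = pd.DataFrame({'p-value': p_values, 'adjusted p-value': adjusted_p})
# Log2 fold change = mean log2 CPM in treated minus mean in control (aligned on the gene index)
results['log2 fold change'] = log2_cpm[treated_cols].mean(axis=1) - log2_cpm[untreated_cols].mean(axis=1)

# Filter significant genes
sig_genes = results[(results['adjusted p-value'] < 0.05) & (results['log2 fold change'].abs() > 1)]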
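
The Benjamini-Hochberg adjustment requested with `method='fdr_bh'` can be written out by hand in a few lines. The sketch below is only an illustration with made-up p-values and a hypothetical helper name, `bh_adjust`. It is not the statsmodels implementation, and it assumes the input contains no missing values.

```python
import numpy as np

def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    p = np.asarray(p_values, dtype=float)
    n = p.size
    order = np.argsort(p)                      # indices that sort p ascending
    # Scale the k-th smallest p-value by n / k
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity: each value is the minimum of itself and all larger ranks
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.clip(scaled, 0.0, 1.0)  # restore the original order
    return adjusted

toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]      # hypothetical raw p-values
toy_adjusted = bh_adjust(toy_p)
print(toy_adjusted)
```

In this toy case the third and fourth genes have raw p-values below 0.05, but their adjusted values land just above it, which is why the filter uses the adjusted column rather than the raw one.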

### Visualization of Differentially Expressed Genes

Generate a volcano plot to visualize the differentially expressed genes.

In [None]:
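# Volcano plot
# Store the y-axis values as a named column so the axis label mapping applies
results['-log10(adjusted p-value)'] = -np.log10(results['adjusted p-value'])
fig = px.scatter(results, x='log2 fold change', y='-log10(adjusted p-value)',
                 title='Volcano Plot of Differentially Expressed Genes',
                 labels={'log2 fold change': 'Log2 Fold Change', '-log10(adjusted p-value)': '-Log10 Adjusted P-Value'},
                 hover_name=results.index)  # show the gene identifier on hover
fig.show()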
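
A volcano plot is easier to read when points are colored by whether they pass the significance filter. The sketch below labels each gene using the thresholds from the filtering step (adjusted p-value below 0.05, absolute log2 fold change above 1). The toy table and the `status` column name are assumptions made for illustration.

```python
import pandas as pd

# Toy results table (hypothetical values) with the same column names as `results`
toy_results = pd.DataFrame(
    {
        "log2 fold change": [2.5, -1.8, 0.3, 1.4],
        "adjusted p-value": [0.001, 0.02, 0.0005, 0.2],
    },
    index=["geneA", "geneB", "geneC", "geneD"],
)

significant = toy_results["adjusted p-value"] < 0.05
up = significant & (toy_results["log2 fold change"] > 1)
down = significant & (toy_results["log2 fold change"] < -1)

# Start from "Not significant" and overwrite the genes that pass both thresholds
status = pd.Series("Not significant", index=toy_results.index)
status[up] = "Up"
status[down] = "Down"
toy_results["status"] = status

print(toy_results)
```

With a column like this on the real table, passing `color='status'` to `px.scatter` would color the three groups separately.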

### Heatmap of Top Differentially Expressed Genes

Visualize the expression levels of the top significant genes across all samples.

In [None]:
# Heatmap
top_genes = sig_genes.nsmallest(20, 'adjusted p-value').index
plt.figure(figsize=(10, 8))
sns.heatmap(log2_cpm.loc[top_genes], cmap='viridis', annot=True)
plt.title('Heatmap of Top Differentially Expressed Genes')
plt.show()

### Conclusion

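The analysis flags candidate genes whose lung expression changes with fusion-inhibitory lipopeptide treatment (adjusted p-value < 0.05 and absolute log2 fold change > 1). These genes are a starting point for investigating the molecular mechanisms of protective immunity against SARS-CoV-2.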




