# Co-occurrence networks of different day periods

Instead of analyzing the networks of different maize subpopulations (which is complex, because the groups are not clearly separated), it makes more sense to compare day and night samples. We hypothesize that day samples do not differ significantly from night samples, so the two networks should be very similar.

In our work, we have two datasets derived from the same original OTU table:
 * OTU table with day and night samples merged, restricted to samples that have a paired RNA-Seq sample
 * The same merged and paired OTU table, additionally filtered by relative abundance

In this notebook, I (RACS) will extract all day and night samples from the original OTU table, regardless of whether they have a transcriptome pair.

In [None]:
import pandas as pd

original_otu_df = pd.read_csv("/home/santosrac/Projects/UGA_RACS/16S/otu_matrices/original_counts/2f_otu_table.sample_filtered.no_mitochondria_chloroplast.tsv",
            sep='\t', index_col=0, dtype={'OTU_ID': str})
original_otu_df.head()

In [12]:
from bioinfokit.analys import norm

# Normalize the data (CPM)
nm = norm()
nm.cpm(df=original_otu_df)
otu_cpm_df = nm.cpm_norm

# Normalize the data (relative abundance)
otu_relabund_df = original_otu_df.divide(original_otu_df.sum())
otu_relabund_df = otu_relabund_df * 100
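
To make the two normalizations above concrete, here is a minimal sketch on a toy count table. The table, the sample names, and the `toy_*` variables are hypothetical. The CPM line is a by-hand version (counts scaled to one million per sample) and is only assumed to match what `bioinfokit` computes.

```python
import pandas as pd

# Hypothetical toy count table: rows are OTUs, columns are samples.
toy_counts = pd.DataFrame(
    {"S1": [10, 30, 60], "S2": [0, 50, 50]},
    index=["otu_a", "otu_b", "otu_c"],
)

# Per-sample library size (column totals).
library_sizes = toy_counts.sum(axis=0)

# Relative abundance in percent: each column is divided by its own total.
toy_relabund = toy_counts.divide(library_sizes, axis=1) * 100

# CPM by hand: the same proportions scaled to one million per sample.
toy_cpm = toy_counts.divide(library_sizes, axis=1) * 1e6

print(toy_relabund.sum(axis=0))  # each sample should sum to roughly 100
```

After either normalization every sample has the same column total, so values are comparable across samples with different sequencing depth.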

In [15]:
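# Keep OTUs whose relative abundance is above 0.001% in at least 50% of the samples
# (otu_relabund_df is already expressed in percent)
otus_tokeep = otu_relabund_df[(otu_relabund_df > 0.001).sum(axis=1) >= (otu_relabund_df.shape[1] * 0.5)].index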

In [19]:
original_otu_filtered_df = original_otu_df[original_otu_df.index.isin(otus_tokeep)]
original_otu_filtered_df.shape

(356, 540)
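
The filter above can be read as an abundance-plus-prevalence rule. The sketch below spells it out step by step on a toy table. The data and the names `min_abundance`, `min_prevalence`, and `toy_*` are hypothetical; only the thresholds (0.001% and 50% of samples) come from the notebook.

```python
import pandas as pd

# Hypothetical relative abundances (percent); rows are OTUs, columns are samples.
toy_relabund = pd.DataFrame(
    {
        "S1": [50.0, 0.0005, 49.9995],
        "S2": [60.0, 0.0, 40.0],
        "S3": [70.0, 0.002, 29.998],
        "S4": [0.0, 0.0, 100.0],
    },
    index=["otu_a", "otu_b", "otu_c"],
)

min_abundance = 0.001  # percent, same threshold as in the notebook
min_prevalence = 0.5   # fraction of samples, same as in the notebook

# For each OTU, count the samples in which it exceeds the abundance threshold.
n_samples_above = (toy_relabund > min_abundance).sum(axis=1)

# Keep OTUs that pass the threshold in at least half of the samples.
keep = n_samples_above >= toy_relabund.shape[1] * min_prevalence
toy_otus_tokeep = toy_relabund[keep].index

print(list(toy_otus_tokeep))
```

In this toy case `otu_b` exceeds the threshold in only one of four samples and is dropped, while `otu_a` and `otu_c` are kept.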

In [20]:
# Night samples: columns whose name contains 'LMAN'
original_otu_filtered_night_df = original_otu_filtered_df.filter(like='LMAN')
original_otu_filtered_night_df.shape

(356, 280)

In [21]:
# Day samples: columns whose name contains 'LMAD'
original_otu_filtered_day_df = original_otu_filtered_df.filter(like='LMAD')
original_otu_filtered_day_df.shape

(356, 260)
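
Since `filter(like=...)` matches substrings of the column names, it is worth checking that the day and night selections do not overlap and together cover every sample (here 280 + 260 = 540, which is consistent). A minimal sketch of that check on a toy table follows. The sample IDs are hypothetical; only the `LMAD` and `LMAN` substrings mirror the notebook.

```python
import pandas as pd

# Hypothetical sample IDs; only the 'LMAD'/'LMAN' substrings come from the notebook.
toy_df = pd.DataFrame(
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    index=["otu_a", "otu_b"],
    columns=["X1.LMAD.1", "X1.LMAN.1", "X2.LMAD.2", "X2.LMAN.2"],
)

# Same substring-based column selection as in the cells above.
toy_day = toy_df.filter(like="LMAD", axis=1)
toy_night = toy_df.filter(like="LMAN", axis=1)

day_cols = set(toy_day.columns)
night_cols = set(toy_night.columns)

# No sample should be selected twice, and no sample should be left out.
no_overlap = day_cols.isdisjoint(night_cols)
all_covered = (day_cols | night_cols) == set(toy_df.columns)

print(no_overlap, all_covered)
```

The same two checks could be run on `original_otu_filtered_day_df` and `original_otu_filtered_night_df` before exporting.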

Export the filtered day and night count tables. Note that I (RACS) filtered only on relative abundance, without considering the coefficient of variation (as in previous analyses).

In [22]:
original_otu_filtered_day_df.to_csv("/home/santosrac/Repositories/maize_microbiome_transcriptomics/poster_presentation_notebooks/pag_2025/original_otu_filtered_day.tsv",
                                    sep='\t')
original_otu_filtered_night_df.to_csv("/home/santosrac/Repositories/maize_microbiome_transcriptomics/poster_presentation_notebooks/pag_2025/original_otu_filtered_night.tsv",
                                    sep='\t')
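
A quick way to confirm that an exported table can be read back unchanged is a round trip through a temporary file. This is only a sketch on a toy table. The file name and the `toy_df` contents are hypothetical, and the `OTU_ID` index label is taken from the `read_csv` call at the top of the notebook.

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy count table with an 'OTU_ID' index label.
toy_df = pd.DataFrame(
    {"S1": [1, 2], "S2": [3, 4]},
    index=pd.Index(["otu_a", "otu_b"], name="OTU_ID"),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "toy_otu_table.tsv")
    # Write and re-read with the same separator and index column as the notebook.
    toy_df.to_csv(out_path, sep="\t")
    reloaded = pd.read_csv(out_path, sep="\t", index_col=0)

# True when values, dtypes, and labels survive the round trip.
round_trip_ok = reloaded.equals(toy_df)
print(round_trip_ok)
```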