## Analysis of Satellite DNA Families in Frieseomelitta varia

This notebook analyzes the abundance and distribution of various satDNA families identified in *Frieseomelitta varia*.

In [None]:
import pandas as pd
import plotly.express as px

# Build a DataFrame from the extracted data.
# Units follow the axis labels used in the plots below.
data = {
    'satDNA_family': ['FvarSat01-306', 'FvarSat02-449', 'FvarSat03-1145', 'FvarSat04-145', 'FvarSat05-131', 'FvarSat06-145', 'FvarSat07-170'],
    'abundance': [9.971, 0.803, 0.245, 0.117, 0.054, 0.019, 0.014],  # abundance (%)
    'length_bp': [306, 449, 1145, 145, 131, 145, 170],  # length (bp)
    'AT_content': [66.0, 53.2, 58.4, 54.5, 72.5, 59.3, 71.2],  # A+T content (%)
    'divergence': [4.62, 2.92, 7.73, 22.09, 8.67, 18.36, 6.13]  # divergence (%)
}
df = pd.DataFrame(data)

### Summary Statistics

Let's compute some summary statistics for the abundance of satDNA families.

In [None]:
summary = df['abundance'].describe()
print(summary)
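
With one family far more abundant than the rest, the mean reported by `describe()` says little about a typical family. The sketch below compares the mean with the median and expresses each family as a share of the total satDNA abundance. It assumes the abundance values are percentages that can be summed; the names `abundance`, `total`, and `share` are introduced here for illustration only.

```python
import pandas as pd

# Abundance values copied from the DataFrame above (assumed to be percentages).
abundance = pd.Series(
    [9.971, 0.803, 0.245, 0.117, 0.054, 0.019, 0.014],
    index=['FvarSat01-306', 'FvarSat02-449', 'FvarSat03-1145', 'FvarSat04-145',
           'FvarSat05-131', 'FvarSat06-145', 'FvarSat07-170'],
    name='abundance',
)

# Combined abundance of all seven families.
total = abundance.sum()

# Share of the combined satDNA abundance contributed by each family (%).
share = (abundance / total * 100).round(2)

print(f"Total satDNA abundance: {total:.3f}%")
print(f"Mean: {abundance.mean():.3f}%  Median: {abundance.median():.3f}%")
print(share)
```

A large gap between mean and median here would indicate that the distribution is driven by a single family rather than spread evenly.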

### Abundance of satDNA Families

Visualizing the abundance of each satDNA family.

In [None]:
fig = px.bar(
    df,
    x='satDNA_family',
    y='abundance',
    title='Abundance of satDNA Families in Frieseomelitta varia',
    labels={'satDNA_family': 'satDNA Family', 'abundance': 'Abundance (%)'},
    color='abundance',
)
fig.show()

### Correlation Between Length and Abundance

Analyzing the relationship between the length of satDNA sequences and their abundance.

In [None]:
corr = df[['length_bp', 'abundance']].corr()
print(corr)

In [None]:
fig = px.scatter(
    df,
    x='length_bp',
    y='abundance',
    size='divergence',
    color='AT_content',
    title='Correlation Between satDNA Length and Abundance',
    labels={'length_bp': 'Length (bp)', 'abundance': 'Abundance (%)'},
    hover_data=['satDNA_family'],
)
fig.show()
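
With only seven families and one very large abundance value, a Pearson coefficient can be dominated by a single point. As a complementary check, the sketch below computes a rank-based (Spearman) correlation by ranking both columns and taking the Pearson correlation of the ranks. This is an added illustration rather than part of the original analysis; `pairs`, `ranks`, and `rho` are names introduced here.

```python
import pandas as pd

# Length and abundance values copied from the DataFrame above.
pairs = pd.DataFrame({
    'length_bp': [306, 449, 1145, 145, 131, 145, 170],
    'abundance': [9.971, 0.803, 0.245, 0.117, 0.054, 0.019, 0.014],
})

# Spearman's rho is Pearson's r computed on ranks;
# tied values (the two 145 bp families) receive their average rank.
ranks = pairs.rank(method='average')
rho = ranks['length_bp'].corr(ranks['abundance'])

print(f"Spearman rank correlation (length vs. abundance): {rho:.3f}")
```

With n = 7, either coefficient should be read as descriptive only; the sample is too small to support a strong claim about a length-abundance relationship.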

### Divergence Analysis

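Exploring how divergence (%) is distributed across the satDNA families.

In [None]:
# points='all' overlays the seven individual values, since a box alone hides them.
fig = px.box(df, y='divergence', points='all', title='Divergence of satDNA Families', labels={'divergence': 'Divergence (%)'}, hover_data=['satDNA_family'])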
fig.show()
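
A box plot summarizes the spread but does not show which family sits where. A minimal sketch, using names introduced here (`div`, `ranked`), sorts the families by divergence so the most and least divergent ones can be read off directly:

```python
import pandas as pd

# Family names, divergence, and abundance copied from the DataFrame above.
div = pd.DataFrame({
    'satDNA_family': ['FvarSat01-306', 'FvarSat02-449', 'FvarSat03-1145', 'FvarSat04-145',
                      'FvarSat05-131', 'FvarSat06-145', 'FvarSat07-170'],
    'divergence': [4.62, 2.92, 7.73, 22.09, 8.67, 18.36, 6.13],
    'abundance': [9.971, 0.803, 0.245, 0.117, 0.054, 0.019, 0.014],
})

# Sort from most to least divergent, keeping abundance alongside for context.
ranked = div.sort_values('divergence', ascending=False).reset_index(drop=True)
print(ranked)
```

In this table the two 145 bp families (FvarSat04-145 and FvarSat06-145) carry the highest divergence values, while the two most abundant families carry the lowest.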

### A+T Content Distribution

Visualizing the distribution of A+T content across satDNA families.

In [None]:
# px.histogram takes `nbins` (not `bins`) to set the maximum number of bins.
fig = px.histogram(df, x='AT_content', nbins=10, title='A+T Content Distribution of satDNA Families', labels={'AT_content': 'A+T Content (%)'})
fig.show()
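
The histogram treats every family equally, although the families differ greatly in abundance. As a hedged complement, the sketch below compares the plain mean A+T content with an abundance-weighted mean. Weighting by abundance is an assumption made here for illustration, and `at`, `ab`, `unweighted`, and `weighted` are names introduced for this example.

```python
import pandas as pd

# A+T content (%) and abundance (%) copied from the DataFrame above.
at = pd.Series([66.0, 53.2, 58.4, 54.5, 72.5, 59.3, 71.2])
ab = pd.Series([9.971, 0.803, 0.245, 0.117, 0.054, 0.019, 0.014])

# Plain mean: each family counts once.
unweighted = at.mean()

# Abundance-weighted mean: each family counts in proportion to its abundance.
weighted = (at * ab).sum() / ab.sum()

print(f"Unweighted mean A+T content: {unweighted:.2f}%")
print(f"Abundance-weighted mean A+T content: {weighted:.2f}%")
```

The weighted value is pulled toward the A+T content of the most abundant family, so it describes the satDNA fraction as a whole rather than a typical family.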

### Conclusion

The analysis summarizes the abundance, length, A+T content, and divergence of seven satDNA families in *Frieseomelitta varia*. FvarSat01-306 dominates: its abundance is 9.971%, while each of the other six families is below 1%.




