We begin by loading the dataset of acceptor splicing sequences from the figshare repository, then filter it to the XY1 6-mer subset and compare the 6-mer dispersion values between splicing modes within each region.

import pandas as pd
import scipy.stats as stats
import plotly.express as px

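# Load dataset from URL
# NOTE: this DOI resolves to the figshare landing page (HTML), not to a CSV file,
# so pd.read_csv will not return the expected table as-is. Point `url` at the
# direct download link of the data file in that record before running.
url = 'https://doi.org/10.6084/m9.figshare.26892364.v2'
df = pd.read_csv(url)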

# Assume the dataframe has columns: 'region', 'splicing_mode', '6mer', 'dispersion'
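# Filter for XY1 6-mers (assumes the label 'XY1' appears literally in the '6mer' column)
# na=False drops rows with a missing 6-mer; regex=False matches 'XY1' as plain text
df_xy1 = df[df['6mer'].str.contains('XY1', na=False, regex=False)]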

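# Perform paired sample t-test for each region between two splicing modes (example: constitutive vs alternative)
# A paired test needs matched observations of equal length, so pair the values by 6-mer:
# pivot so each row holds one 6-mer's dispersion under both modes, then drop unmatched rows.
results = []
for region, group in df_xy1.groupby('region'):
    paired = group.pivot_table(index='6mer', columns='splicing_mode', values='dispersion', aggfunc='mean')
    if 'constitutive' in paired.columns and 'alternative' in paired.columns:
        paired = paired[['constitutive', 'alternative']].dropna()
        if len(paired) < 2:
            continue  # too few matched 6-mers for a paired test
        t_stat, p_val = stats.ttest_rel(paired['constitutive'], paired['alternative'])
        results.append({'region': region, 'n_pairs': len(paired), 't_stat': t_stat, 'p_val': p_val})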

result_df = pd.DataFrame(results)
print(result_df)

# Create a dispersion profile plot
fig = px.box(df_xy1, x='region', y='dispersion', color='splicing_mode', title='Dispersion Profiles of XY1 6-mer Subsets by Region')
fig.show()

The code above runs a paired t-test in each region, comparing dispersion between two splicing modes (constitutive and alternative, as an example), and then draws box plots of dispersion by region, colored by splicing mode.
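
The analysis relies on four assumed columns (`region`, `splicing_mode`, `6mer`, `dispersion`) that the dataset itself does not confirm. A minimal sketch of a check that could run right after loading is shown below. The helper name `validate_dispersion_frame` and the small `demo` frame are hypothetical and only illustrate the idea; they are not part of the original analysis or of the real data.

```python
import pandas as pd

# Same column names the analysis assumes; not confirmed against the real file.
REQUIRED_COLUMNS = ('region', 'splicing_mode', '6mer', 'dispersion')

def validate_dispersion_frame(frame):
    """Return a cleaned copy of `frame`; raise ValueError if a required column is missing."""
    missing = [c for c in REQUIRED_COLUMNS if c not in frame.columns]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    cleaned = frame.copy()
    # Coerce dispersion to numeric; entries that cannot be parsed become NaN.
    cleaned['dispersion'] = pd.to_numeric(cleaned['dispersion'], errors='coerce')
    # Drop rows with a missing value in any required column.
    cleaned = cleaned.dropna(subset=list(REQUIRED_COLUMNS))
    return cleaned.reset_index(drop=True)

# Small synthetic frame, for illustration only (not real data).
demo = pd.DataFrame({
    'region': ['R1', 'R1', 'R2'],
    'splicing_mode': ['constitutive', 'alternative', 'constitutive'],
    '6mer': ['AAAAAA', 'AAAAAA', 'CCCCCC'],
    'dispersion': ['0.8', 'n/a', 1.2],
})
checked = validate_dispersion_frame(demo)
print(checked)
```

Calling `validate_dispersion_frame(df)` before the XY1 filter would surface a wrong column name or a non-tabular download as an explicit error instead of a `KeyError` deeper in the analysis.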






### [Created with BioloGPT](https://biologpt.com/?q=Paper%20Review%3A%20Difference%20Analysis%20Among%20Six%20Kinds%20of%20Acceptor%20Splicing%20Sequences%20by%20the%20Dispersion%20Features%20of%206-mer%20Subsets%20in%20Human%20Genes)
***