# Analysis of Gene ENSG00000187764 Peak Overlap Issues

This notebook analyzes why gene ENSG00000187764 failed the exonic fraction filter despite having significant peaks.

## Summary of Results:
- **Gene**: ENSG00000187764 on chromosome 9 (89,360,787 - 89,498,130, negative strand)
- **Best Peak**: chr9:89,367,013-89,367,102 (90 bp counting both endpoints; the cells below report `end - start` = 89 bp)
- **Peak Score**: 28.0 (significant)
- **Exonic Fraction**: 0.0 (0% overlap with exons)
- **Failure Reason**: Failed exonic fraction filter (<0.1)

## Question: Why does the peak have 0% exonic overlap?

In [2]:
# Import Required Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

# Set up plotting parameters
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 8)

In [3]:
# Load and Display Gene Data
gene_data = pd.DataFrame({
    'gene_id': ['ENSG00000187764'],
    'gene_chr': [9],
    'gene_start': [89360787],
    'gene_end': [89498130],
    'gene_strand': ['-'],
    'peak_chr': [9.0],
    'peak_start': [89367013.0],
    'peak_end': [89367102.0],
    'peak_strand': ['*'],
    'has_macs2_peaks': [True],
    'has_significant_peaks': [True],
    'has_overlapping_peaks': [True],
    'is_best_peak': [True],
    'passes_exonic_filter': [False],
    'final_selection': [False],
    'failure_reason': ['Failed exonic fraction filter (<0.1)'],
    'peak_count_raw': [5],
    'peak_count_significant': [5],
    'peak_count_overlapping': [5],
    'best_peak_score': [28.0],
    'best_peak_pvalue': [5.88149],
    'best_peak_qvalue': [2.80008],
    'exonic_fraction': [0.0]
})

print("Gene ENSG00000187764 Analysis Results:")
print("="*50)
for col in gene_data.columns:
    print(f"{col}: {gene_data[col].iloc[0]}")
    
print("\nKey Observations:")
print(f"• Gene spans: {gene_data['gene_end'].iloc[0] - gene_data['gene_start'].iloc[0]:,} bp")
print(f"• Peak spans: {int(gene_data['peak_end'].iloc[0] - gene_data['peak_start'].iloc[0])} bp")
print(f"• Peak is {gene_data['peak_start'].iloc[0] - gene_data['gene_start'].iloc[0]:,.0f} bp from gene start")

Gene ENSG00000187764 Analysis Results:
gene_id: ENSG00000187764
gene_chr: 9
gene_start: 89360787
gene_end: 89498130
gene_strand: -
peak_chr: 9.0
peak_start: 89367013.0
peak_end: 89367102.0
peak_strand: *
has_macs2_peaks: True
has_significant_peaks: True
has_overlapping_peaks: True
is_best_peak: True
passes_exonic_filter: False
final_selection: False
failure_reason: Failed exonic fraction filter (<0.1)
peak_count_raw: 5
peak_count_significant: 5
peak_count_overlapping: 5
best_peak_score: 28.0
best_peak_pvalue: 5.88149
best_peak_qvalue: 2.80008
exonic_fraction: 0.0

Key Observations:
• Gene spans: 137,343 bp
• Peak spans: 89 bp
• Peak is 6,226 bp from gene start


In [4]:
# Analyze Peak Characteristics in Detail
gene_start = int(gene_data['gene_start'].iloc[0])
gene_end = int(gene_data['gene_end'].iloc[0])
peak_start = int(gene_data['peak_start'].iloc[0])
peak_end = int(gene_data['peak_end'].iloc[0])

print("📍 COORDINATE ANALYSIS")
print("="*40)
print(f"Gene ENSG00000187764 (chromosome 9, negative strand):")
print(f"  🧬 Gene span: {gene_start:,} - {gene_end:,} ({gene_end-gene_start:,} bp)")
print(f"  🎯 Peak span: {peak_start:,} - {peak_end:,} ({peak_end-peak_start:,} bp)")
print()

print("📏 RELATIVE POSITIONS")
print("="*40)
distance_from_start = peak_start - gene_start
distance_from_end = gene_end - peak_end
relative_position = (peak_start - gene_start) / (gene_end - gene_start) * 100

print(f"  • Peak is {distance_from_start:,} bp downstream of gene start")
print(f"  • Peak is {distance_from_end:,} bp upstream of gene end") 
print(f"  • Peak is at {relative_position:.1f}% of gene length")
print()

print("🔬 PEAK PROPERTIES")
print("="*40)
print(f"  • Peak score: {gene_data['best_peak_score'].iloc[0]}")
print(f"  • P-value: {gene_data['best_peak_pvalue'].iloc[0]:.2f}")
print(f"  • Q-value: {gene_data['best_peak_qvalue'].iloc[0]:.2f}")
print(f"  • Total peaks found: {gene_data['peak_count_raw'].iloc[0]}")
print(f"  • Significant peaks: {gene_data['peak_count_significant'].iloc[0]}")
print(f"  • Exonic fraction: {gene_data['exonic_fraction'].iloc[0]:.1%}")

if gene_data['exonic_fraction'].iloc[0] == 0.0:
    print("  ⚠️  WARNING: Peak has 0% exonic overlap!")

📍 COORDINATE ANALYSIS
Gene ENSG00000187764 (chromosome 9, negative strand):
  🧬 Gene span: 89,360,787 - 89,498,130 (137,343 bp)
  🎯 Peak span: 89,367,013 - 89,367,102 (89 bp)

📏 RELATIVE POSITIONS
  • Peak is 6,226 bp downstream of gene start
  • Peak is 131,028 bp upstream of gene end
  • Peak is at 4.5% of gene length

🔬 PEAK PROPERTIES
  • Peak score: 28.0
  • P-value: 5.88
  • Q-value: 2.80
  • Total peaks found: 5
  • Significant peaks: 5
  • Exonic fraction: 0.0%
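
The P-value and Q-value printed above are both greater than 1, so they cannot be raw probabilities. MACS2 narrowPeak output stores these columns as -log10 values, and the sketch below converts them back on the assumption that `best_peak_pvalue` and `best_peak_qvalue` carry that encoding. The helper name `from_neg_log10` is hypothetical.

```python
# Assumption: the stored values are -log10-transformed,
# as in MACS2 narrowPeak columns 8 (p-value) and 9 (q-value).
neg_log10_p = 5.88149
neg_log10_q = 2.80008


def from_neg_log10(value: float) -> float:
    """Convert a -log10-transformed value back to a probability."""
    return 10 ** (-value)


p_value = from_neg_log10(neg_log10_p)
q_value = from_neg_log10(neg_log10_q)

print(f"p-value: {p_value:.2e}")
print(f"q-value: {q_value:.2e}")
print(f"passes q <= 0.1: {q_value <= 0.1}")
```

Any comparison against a q-value threshold such as 0.1 should use the converted value, not the stored 2.80.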


In [None]:
# Visualize Peak Location Relative to Gene
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(14, 10))

# Top plot: Full gene view
ax1.barh(0, gene_end - gene_start, left=gene_start, height=0.3, 
         color='lightblue', alpha=0.7, label='Gene ENSG00000187764')
ax1.barh(0, peak_end - peak_start, left=peak_start, height=0.5, 
         color='red', alpha=0.8, label='MACS2 Peak')

ax1.set_xlim(gene_start - 10000, gene_end + 10000)
ax1.set_ylim(-0.5, 0.5)
ax1.set_xlabel('Chromosome 9 Position')
ax1.set_title('Gene ENSG00000187764 and Peak Location Overview', fontsize=14, fontweight='bold')
ax1.legend()
ax1.grid(True, alpha=0.3)

# Format x-axis with commas
ax1.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{x:,.0f}'))

# Bottom plot: Zoomed view around peak
zoom_start = peak_start - 5000
zoom_end = peak_end + 5000

ax2.barh(0, gene_end - gene_start, left=gene_start, height=0.3, 
         color='lightblue', alpha=0.7, label='Gene span')
ax2.barh(0, peak_end - peak_start, left=peak_start, height=0.5, 
         color='red', alpha=0.8, label='Peak (90 bp)')

ax2.set_xlim(zoom_start, zoom_end)
ax2.set_ylim(-0.5, 0.5)
ax2.set_xlabel('Chromosome 9 Position (Zoomed)')
ax2.set_title('Zoomed View: Peak Region Detail', fontsize=14, fontweight='bold')
ax2.legend()
ax2.grid(True, alpha=0.3)
ax2.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{x:,.0f}'))

# Add annotations
ax1.annotate(f'Peak: {peak_start:,}-{peak_end:,}', 
             xy=(peak_start, 0.25), xytext=(peak_start, 0.4),
             arrowprops=dict(arrowstyle='->', color='red', lw=2),
             fontsize=10, ha='center', color='red', fontweight='bold')

ax2.annotate('90 bp peak\n(0% exonic)', 
             xy=(peak_start + (peak_end-peak_start)/2, 0.25), xytext=(peak_start + (peak_end-peak_start)/2, 0.4),
             arrowprops=dict(arrowstyle='->', color='red', lw=2),
             fontsize=10, ha='center', color='red', fontweight='bold')

plt.tight_layout()
plt.show()

print(f"📊 VISUALIZATION SUMMARY:")
print(f"The peak (red bar) lies near the low-coordinate end of the gene, which is the 3' end for this negative-strand gene.")
print(f"Peak position relative to gene: {relative_position:.1f}% of gene length from the low-coordinate end")

## 🔍 Why Does the Peak Have 0% Exonic Overlap?

The **exonic fraction = 0.0** indicates that the peak **does not overlap with any exons** of gene ENSG00000187764. Here are the most likely explanations:

### 1. **Peak in Intron** 🧬
- The peak is located within an **intronic region** of the gene
- Introns are non-coding sequences between exons
- RNA-seq peaks in introns can indicate:
  - **Nascent RNA transcription** (still being processed)
  - **Alternative splicing sites**
  - **Regulatory elements** within introns

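### 2. **Peak in Promoter/UTR Region** 📍
- A non-exonic peak could in principle lie in a **promoter** or in an **unannotated UTR extension**; if the filter uses GTF `exon` features, annotated UTRs already count as exonic
- For a **negative strand gene**, transcription runs from high to low coordinates, so the gene "start" coordinate (89,360,787) is the **3' end** of the gene. The peak at chr9:89,367,013 is therefore:
  - About 6.2 kb inside the gene from its **3' end**, and about 131 kb from the promoter at the high-coordinate end
  - More consistent with **3'-proximal sequence** than with a promoter or 5' UTR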
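
Because the gene is on the negative strand, its transcription start site (TSS) is at the high-coordinate end (`gene_end`) and its transcription end site (TES) is at the low-coordinate end (`gene_start`). A minimal sketch of strand-aware distances, using the coordinates from this notebook; the function name `distances_to_gene_ends` is hypothetical:

```python
# Gene and peak coordinates taken from the summary at the top of the notebook.
gene_start, gene_end, strand = 89_360_787, 89_498_130, "-"
peak_start, peak_end = 89_367_013, 89_367_102


def distances_to_gene_ends(gene_start, gene_end, strand, peak_start, peak_end):
    """Return (distance to TSS, distance to TES) in bp for a peak inside a gene."""
    if strand == "+":
        # Plus strand: transcription starts at the lowest coordinate.
        return peak_start - gene_start, gene_end - peak_end
    # Minus strand: transcription starts at the highest coordinate.
    return gene_end - peak_end, peak_start - gene_start


to_tss, to_tes = distances_to_gene_ends(
    gene_start, gene_end, strand, peak_start, peak_end
)
print(f"Distance from TSS (5' end): {to_tss:,} bp")
print(f"Distance from TES (3' end): {to_tes:,} bp")
```

With these inputs the peak is far closer to the 3' end of the gene than to its promoter.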

### 3. **Gene Annotation vs. Peak Location** 📝
- The gene spans **137,343 bp** (huge gene!)
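- The peak is only **90 bp** and lies 6,226 bp from the gene's low-coordinate end (4.5% of the gene length), which is the 3' end for this negative-strand gene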
- Large genes often have complex structures with many introns

### 4. **Filtering Implications** ⚠️
The `--min_exonic_fraction 0.1` filter requires **≥10% exonic overlap**, but this peak has **0% overlap**, causing it to fail the filter despite being:
- **Statistically significant** (score: 28.0)
- **Within the gene boundaries**
- **One of 5 significant peaks** in the gene (5/5 raw peaks passed significance)
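
The pipeline's exact definition of exonic fraction is not shown in this notebook. The sketch below assumes it is the number of peak bases covered by exons divided by the peak length, with 1-based inclusive coordinates and non-overlapping exons. The function name `exonic_fraction_of_peak` and the shifted peak are hypothetical; the exon coordinates are the two flanking exons reported in the GTF analysis cell further down.

```python
def exonic_fraction_of_peak(peak_start, peak_end, exons):
    """Fraction of peak bases covered by exons.

    Assumes 1-based inclusive coordinates and non-overlapping exons.
    """
    peak_len = peak_end - peak_start + 1
    covered = 0
    for exon_start, exon_end in exons:
        overlap = min(peak_end, exon_end) - max(peak_start, exon_start) + 1
        covered += max(0, overlap)  # negative overlap means the intervals are disjoint
    return covered / peak_len


flanking_exons = [(89_363_430, 89_365_654), (89_367_123, 89_369_519)]

# Observed best peak: lies entirely between the two exons.
observed = exonic_fraction_of_peak(89_367_013, 89_367_102, flanking_exons)

# Hypothetical peak shifted 30 bp to the right, for comparison only.
shifted = exonic_fraction_of_peak(89_367_043, 89_367_132, flanking_exons)

print(f"Observed peak:        {observed:.3f}")
print(f"Hypothetical shifted: {shifted:.3f}")
```

Under these assumptions a 90 bp peak needs at least 9 exonic bases to reach the 0.1 threshold.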

In [5]:
# Extract and analyze exon coordinates from GTF analysis
print("🧬 DETAILED EXON STRUCTURE ANALYSIS")
print("="*50)

# Peak coordinates from our analysis
peak_start = 89367013  # From previous analysis (corrected)
peak_end = 89367102    # From previous analysis (corrected)

print(f"Peak location: chr9:{peak_start:,}-{peak_end:,}")
print(f"Peak size: {peak_end - peak_start} bp")
print()

# Key exon ranges near the peak (from GTF analysis)
print("NEAREST EXONS TO PEAK:")
print("-" * 30)

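# Exons that bracket our peak. The "upstream"/"downstream" labels in the
# descriptions follow genomic coordinate order (lower/higher), not transcript orientation.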
exons_near_peak = [
    (89363430, 89365654, "Large exon upstream"),
    (89367123, 89369519, "Large exon downstream")
]

for exon_start, exon_end, description in exons_near_peak:
    print(f"  {description}:")
    print(f"    Exon: chr9:{exon_start:,}-{exon_end:,} ({exon_end-exon_start:,} bp)")
    
    # Check overlap with peak
    overlap_start = max(peak_start, exon_start)
    overlap_end = min(peak_end, exon_end)
    
    if overlap_start <= overlap_end:
        overlap_size = overlap_end - overlap_start + 1
        print(f"    ✅ OVERLAPS with peak ({overlap_size} bp)")
    else:
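        distance = min(abs(peak_start - exon_end), abs(peak_end - exon_start))
        # A peak at higher coordinates than the exon is labeled "upstream" here,
        # which matches transcript orientation for this negative-strand gene.
        position = "upstream" if peak_start > exon_end else "downstream"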
        print(f"    ❌ No overlap - peak is {distance:,} bp {position}")
    print()

# Calculate exact intronic gap where peak is located
upstream_exon_end = 89365654
downstream_exon_start = 89367123
intron_size = downstream_exon_start - upstream_exon_end - 1

print("🎯 PEAK LOCATION ANALYSIS:")
print("-" * 30)
print(f"Peak is located in INTRON between:")
print(f"  • Upstream exon ends at: {upstream_exon_end:,}")
print(f"  • Downstream exon starts at: {downstream_exon_start:,}")
print(f"  • Intron size: {intron_size:,} bp")
print(f"  • Peak position in intron: {peak_start - upstream_exon_end:,} bp from upstream exon")
print(f"  • Distance to downstream exon: {downstream_exon_start - peak_end:,} bp")

print(f"\n💡 CONCLUSION:")
print(f"The peak is located in a {intron_size:,} bp intron, explaining the 0% exonic overlap!")
print(f"Peak occupies positions {peak_start - upstream_exon_end:,}-{peak_end - upstream_exon_end:,} within this intron.")

🧬 DETAILED EXON STRUCTURE ANALYSIS
Peak location: chr9:89,367,013-89,367,102
Peak size: 89 bp

NEAREST EXONS TO PEAK:
------------------------------
  Large exon upstream:
    Exon: chr9:89,363,430-89,365,654 (2,224 bp)
    ❌ No overlap - peak is 1,359 bp upstream

  Large exon downstream:
    Exon: chr9:89,367,123-89,369,519 (2,396 bp)
    ❌ No overlap - peak is 21 bp downstream

🎯 PEAK LOCATION ANALYSIS:
------------------------------
Peak is located in INTRON between:
  • Upstream exon ends at: 89,365,654
  • Downstream exon starts at: 89,367,123
  • Intron size: 1,468 bp
  • Peak position in intron: 1,359 bp from upstream exon
  • Distance to downstream exon: 21 bp

💡 CONCLUSION:
The peak is located in a 1,468 bp intron, explaining the 0% exonic overlap!
Peak occupies positions 1,359-1,448 within this intron.
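
The output above mixes two length conventions: the intron size (1,468 bp) counts both endpoints, while the peak and exon sizes are computed as `end - start`. The notebook does not state which convention the pipeline's coordinates follow, so this sketch prints both for each interval; the helper names are hypothetical.

```python
def length_inclusive(start, end):
    """Length when both endpoints are counted (1-based, GTF-style)."""
    return end - start + 1


def length_half_open(start, end):
    """Length when the end coordinate is excluded (0-based, BED-style)."""
    return end - start


intervals = {
    "peak": (89_367_013, 89_367_102),
    "upstream exon": (89_363_430, 89_365_654),
    "downstream exon": (89_367_123, 89_369_519),
    "intron": (89_365_655, 89_367_122),
}

for name, (start, end) in intervals.items():
    print(
        f"{name}: inclusive={length_inclusive(start, end):,} bp, "
        f"half-open={length_half_open(start, end):,} bp"
    )
```

This accounts for the peak being described as both 89 bp and 90 bp in different parts of the notebook.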


In [None]:
# Create detailed exon-intron structure visualization
fig, ax = plt.subplots(1, 1, figsize=(16, 6))

# Define coordinates for the region around the peak
region_start = 89360000
region_end = 89375000

# Key exons near the peak
upstream_exon = (89363430, 89365654)
downstream_exon = (89367123, 89369519)
peak_coords = (89367013, 89367102)

# Plot gene backbone
ax.barh(0, region_end - region_start, left=region_start, height=0.1, 
        color='lightgray', alpha=0.5, label='Gene region')

# Plot exons
ax.barh(0, upstream_exon[1] - upstream_exon[0], left=upstream_exon[0], height=0.4, 
        color='blue', alpha=0.8, label='Exons')
ax.barh(0, downstream_exon[1] - downstream_exon[0], left=downstream_exon[0], height=0.4, 
        color='blue', alpha=0.8)

# Plot intron
intron_start = upstream_exon[1]
intron_end = downstream_exon[0]
ax.barh(0, intron_end - intron_start, left=intron_start, height=0.15, 
        color='yellow', alpha=0.6, label='Intron')

# Plot peak
ax.barh(0, peak_coords[1] - peak_coords[0], left=peak_coords[0], height=0.6, 
        color='red', alpha=0.9, label='MACS2 Peak')

# Formatting
ax.set_xlim(region_start, region_end)
ax.set_ylim(-0.5, 0.5)
ax.set_xlabel('Chromosome 9 Position')
ax.set_title('Gene ENSG00000187764: Detailed Exon-Intron Structure and Peak Location', 
             fontsize=14, fontweight='bold')
ax.legend(loc='upper right')
ax.grid(True, alpha=0.3)

# Format x-axis
ax.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{x:,.0f}'))

# Add annotations
ax.annotate('Upstream Exon\n(2,224 bp)', 
            xy=(upstream_exon[0] + (upstream_exon[1]-upstream_exon[0])/2, 0.2), 
            xytext=(upstream_exon[0] + (upstream_exon[1]-upstream_exon[0])/2, 0.35),
            arrowprops=dict(arrowstyle='->', color='blue', lw=1),
            fontsize=9, ha='center', color='blue', fontweight='bold')

ax.annotate('Downstream Exon\n(2,396 bp)', 
            xy=(downstream_exon[0] + (downstream_exon[1]-downstream_exon[0])/2, 0.2), 
            xytext=(downstream_exon[0] + (downstream_exon[1]-downstream_exon[0])/2, 0.35),
            arrowprops=dict(arrowstyle='->', color='blue', lw=1),
            fontsize=9, ha='center', color='blue', fontweight='bold')

ax.annotate('Peak in Intron\n(89 bp)\n0% exonic overlap', 
            xy=(peak_coords[0] + (peak_coords[1]-peak_coords[0])/2, 0.3), 
            xytext=(peak_coords[0] + (peak_coords[1]-peak_coords[0])/2, 0.45),
            arrowprops=dict(arrowstyle='->', color='red', lw=2),
            fontsize=10, ha='center', color='red', fontweight='bold',
            bbox=dict(boxstyle="round,pad=0.3", facecolor='white', alpha=0.8))

ax.annotate('Intron (1,468 bp)', 
            xy=(intron_start + (intron_end-intron_start)/2, 0.075), 
            xytext=(intron_start + (intron_end-intron_start)/2, -0.15),
            arrowprops=dict(arrowstyle='->', color='orange', lw=1),
            fontsize=9, ha='center', color='orange', fontweight='bold')

plt.tight_layout()
plt.show()

print("\n" + "🎯" * 30)
print("FINAL ANALYSIS SUMMARY")
print("🎯" * 30)
print(f"✅ Peak Quality: EXCELLENT (score: 28.0, -log10 q-value: 2.80)")
print(f"✅ Peak Location: WITHIN gene boundaries ✓")
print(f"❌ Exonic Overlap: 0% - peak is in 1,468 bp intron")
print(f"❌ Filter Result: FAILED min_exonic_fraction ≥ 0.1")
print()
print("📍 EXACT LOCATION:")
print(f"   Peak: chr9:89,367,013-89,367,102 (89 bp)")
print(f"   Intron: chr9:89,365,655-89,367,122 (1,468 bp)")
print(f"   Position: peak end is 21 bp below the start of the next exon (genomic coordinates)")
print()
print("🔧 RECOMMENDED SOLUTIONS:")
print("   1. Use --min_exonic_fraction 0.0 (allow intronic peaks)")
print("   2. Use --trim_to_exon false (include regulatory regions)")
print("   3. Consider biological relevance of intronic regulatory elements")

In [None]:
# Examine Statistical Significance and Quality Metrics
print("📈 STATISTICAL SIGNIFICANCE ANALYSIS")
print("="*50)

# Peak quality assessment
peak_score = gene_data['best_peak_score'].iloc[0]
peak_pvalue = gene_data['best_peak_pvalue'].iloc[0] 
peak_qvalue = gene_data['best_peak_qvalue'].iloc[0]

print(f"Peak Quality Metrics:")
print(f"  • MACS2 Score: {peak_score}")
if peak_score >= 10:
    print("    ✅ Score ≥ 10 (meets min_peak_score threshold)")
else:
    print("    ❌ Score < 10 (below min_peak_score threshold)")

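print(f"  • -log10(p-value): {peak_pvalue:.2f}")
print(f"  • -log10(q-value): {peak_qvalue:.2f}")

# Check against thresholds used in the pipeline
qvalue_threshold = 0.1  # Based on output directory name
min_peak_score = 10     # From the command

# The stored values exceed 1, so they are -log10-transformed (MACS2 convention);
# convert back to a raw q-value before comparing with the threshold
peak_qvalue_raw = 10 ** (-peak_qvalue)

print(f"\nThreshold Comparisons:")
print(f"  • Q-value threshold used: ≤ {qvalue_threshold}")
print(f"  • Raw q-value: {peak_qvalue_raw:.2e}")
if peak_qvalue_raw <= qvalue_threshold:
    print("    ✅ Peak passes q-value filter")
else:
    print("    ❌ Peak fails q-value filter")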

print(f"  • Min peak score used: ≥ {min_peak_score}")
if peak_score >= min_peak_score:
    print("    ✅ Peak passes score filter")
else:
    print("    ❌ Peak fails score filter")

print(f"  • Min exonic fraction used: ≥ 0.1 (10%)")
exonic_fraction = gene_data['exonic_fraction'].iloc[0]
if exonic_fraction >= 0.1:
    print("    ✅ Peak passes exonic fraction filter")
else:
    print(f"    ❌ Peak fails exonic fraction filter ({exonic_fraction:.1%} < 10%)")

print(f"\n🎯 FILTERING PIPELINE RESULTS:")
print(f"  1. Raw peaks found: {gene_data['peak_count_raw'].iloc[0]}")
print(f"  2. Significant peaks: {gene_data['peak_count_significant'].iloc[0]} ✅")
print(f"  3. Overlapping with gene: {gene_data['peak_count_overlapping'].iloc[0]} ✅") 
print(f"  4. Best peak selected: {'Yes' if gene_data['is_best_peak'].iloc[0] else 'No'} ✅")
print(f"  5. Passes exonic filter: {'Yes' if gene_data['passes_exonic_filter'].iloc[0] else 'No'} ❌")
print(f"  6. Final selection: {'Yes' if gene_data['final_selection'].iloc[0] else 'No'} ❌")
print(f"\n💡 The gene failed at step 5 due to insufficient exonic overlap!")

In [None]:
# Create a comprehensive summary visualization
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(2, 2, figsize=(16, 12))

# 1. Peak filtering pipeline flow
stages = ['Raw\nPeaks', 'Significant\nPeaks', 'Gene\nOverlap', 'Best\nPeak', 'Exonic\nFilter', 'Final\nSelection']
counts = [5, 5, 5, 1, 0, 0]
colors = ['green', 'green', 'green', 'green', 'red', 'red']

bars = ax1.bar(stages, counts, color=colors, alpha=0.7)
ax1.set_ylabel('Peak Count')
ax1.set_title('ENSG00000187764: Peak Filtering Pipeline', fontweight='bold')
ax1.set_ylim(0, 6)

# Add count labels on bars
for bar, count in zip(bars, counts):
    if count > 0:
        ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.1, 
                str(count), ha='center', va='bottom', fontweight='bold')

# Add failure point annotation
ax1.annotate('FAILURE\nPOINT', xy=(4, 0), xytext=(4, 3),
            arrowprops=dict(arrowstyle='->', color='red', lw=3),
            fontsize=12, ha='center', color='red', fontweight='bold')

# 2. Quality metrics comparison
# The q-value is stored as -log10(q), so the 0.1 threshold becomes -log10(0.1) = 1.0
metrics = ['Peak Score', '-log10(Q-value)', 'Exonic Fraction']
values = [28.0, 2.80, 0.0]
thresholds = [10.0, 1.0, 0.1]  # Note: q-value threshold of 0.1 is based on directory name
colors_metrics = ['green' if v >= t else 'red' for v, t in zip(values, thresholds)]

bars2 = ax2.bar(metrics, values, color=colors_metrics, alpha=0.7)
ax2.axhline(y=10, color='gray', linestyle='--', alpha=0.5, label='Score threshold (10)')
ax2.axhline(y=1.0, color='gray', linestyle='-.', alpha=0.5, label='-log10(q) threshold (1.0)')
ax2.axhline(y=0.1, color='gray', linestyle=':', alpha=0.5, label='Exonic fraction threshold (0.1)')

ax2.set_ylabel('Value')
ax2.set_title('Quality Metrics vs Thresholds', fontweight='bold')
ax2.legend()

# Add value labels
for bar, value in zip(bars2, values):
    ax2.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 0.05, 
            f'{value:.1f}', ha='center', va='bottom', fontweight='bold')

# 3. Gene and peak coordinates
ax3.barh(1, gene_end - gene_start, left=gene_start, height=0.2, 
         color='lightblue', alpha=0.7, label='Gene ENSG00000187764')
ax3.barh(1, peak_end - peak_start, left=peak_start, height=0.4, 
         color='red', alpha=0.8, label='Peak (90 bp)')

ax3.set_xlim(gene_start - 20000, gene_start + 50000)  # Focus on gene start region
ax3.set_ylim(0.5, 1.5)
ax3.set_xlabel('Chromosome 9 Position')
ax3.set_title('Peak Location Detail (Near Gene Start)', fontweight='bold')
ax3.legend()
ax3.xaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'{x:,.0f}'))

# 4. Summary statistics pie chart
labels = ['Passed Filters', 'Failed at Exonic\nFraction']
sizes = [0, 1]  # This gene failed
colors_pie = ['lightgreen', 'lightcoral']
explode = (0, 0.1)  # explode the failure slice

ax4.pie(sizes, labels=labels, colors=colors_pie, autopct='%1.0f%%',
        startangle=90, explode=explode)
ax4.set_title('Gene Selection Outcome', fontweight='bold')

plt.tight_layout()
plt.show()

print("\n" + "="*60)
print("🎯 KEY INSIGHTS FOR GENE ENSG00000187764")
print("="*60)
print("✅ STRENGTHS:")
print("   • High-quality peak (score: 28.0)")
print("   • Statistically significant (-log10 q-value: 2.80, q ≈ 0.0016)")
print("   • All 5 raw peaks in the gene passed the significance filter")
print("   • Located within gene boundaries")
print()
print("❌ WEAKNESS:")
print("   • Peak is in NON-EXONIC region (0% exonic overlap)")
print("   • Located in a 1,468 bp intron, 21 bp from the next annotated exon")
print()
print("🔧 SOLUTION:")
print("   • Reduce --min_exonic_fraction to 0.0 to include intronic peaks")
print("   • Or use --trim_to_exon false for regulatory regions")

## 📋 Summary: Why Peak RNA021484_1Aligned_peak_140904 Failed

### **The Problem** ❌
Gene **ENSG00000187764** has a **high-quality, significant peak** but **failed the exonic fraction filter** because:

1. **Peak Location**: chr9:89,367,013-89,367,102 (90 bp)
2. **Exonic Overlap**: 0.0% (completely non-exonic)
3. **Filter Requirement**: ≥10% exonic overlap (`--min_exonic_fraction 0.1`)
4. **Result**: Peak rejected despite being statistically significant

### **Why This Happens** 🧬
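The peak lies in a **non-exonic region** of the gene. The exon structure analysis places it in a **1,468 bp intron** (chr9:89,365,655-89,367,122), with the peak ending 21 bp before the next annotated exon. Other non-exonic locations fit the coordinates less well:
- **UTR region**: possible only if the UTR is not annotated as exonic
- **Regulatory element**: possible, since regulatory sequences can sit inside introns

For large genes (137,343 bp like this one), peaks often fall in introns between distant exons.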

### **Biological Significance** 🔬
Non-exonic peaks can still be **biologically meaningful**:
- **Intronic regulatory elements**
- **Alternative splicing signals**
- **Enhancer sequences**
- **Nascent RNA processing sites**

### **Solutions for Low-Coverage Genes** 🛠️
```bash
# Option 1: Allow intronic peaks
--min_exonic_fraction 0.0

# Option 2: More permissive threshold  
--min_exonic_fraction 0.05  # 5% minimum

# Option 3: Disable exon trimming
--trim_to_exon false
```
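
The effect of each threshold on this particular peak can be checked directly. The sketch below assumes the filter keeps a peak when its exonic fraction is greater than or equal to the threshold, as the "≥10%" requirement stated earlier implies. It does not model `--trim_to_exon`, whose behavior is not described in this notebook.

```python
exonic_fraction = 0.0  # value reported for the best peak of ENSG00000187764

# Current setting (0.1) plus the two alternative thresholds listed above.
# Assumption: a peak passes when exonic_fraction >= threshold.
results = {
    threshold: exonic_fraction >= threshold
    for threshold in (0.1, 0.05, 0.0)
}

for threshold, passes in results.items():
    status = "pass" if passes else "fail"
    print(f"--min_exonic_fraction {threshold}: {status}")
```

Under that assumption only a threshold of 0.0 rescues this peak; lowering the threshold to 0.05 still excludes a peak with no exonic overlap.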

### **Conclusion** 💡
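This is a **filter design issue**, not a peak quality issue. The peak passes the score and significance filters but is excluded because it lies entirely within an intron, ending 21 bp short of the next annotated exon. A strict exonic requirement is a poor fit for genes with complex structures or for signal in regulatory regions, so whether to relax it depends on whether intronic signal is relevant to the analysis.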