# Define Event Boundaries

This notebook shows where the gain and loss regions lie on a chromosome, so you can decide which specific boundaries to define for an event you have observed in the plots.

Use the chart at the end of the notebook to determine which boundaries make the most sense. The chart is interactive unless a browser extension is blocking JavaScript: hover over a region and a tooltip shows the start and end of that region, as well as the number of cancer types in which at least 20% of patients show that event:

![tooltip_example](../data/tooltip_example.png)
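
As a rough illustration of what the tooltip count means, the sketch below counts the cancer types in which at least 20% of patients show an event in one region. The function name `count_cancer_types` and the fractions are hypothetical and made up for illustration; this is not how `cnvutils` computes the value. In the loss tables later in this notebook, the same kind of count appears as a negative number.

```python
# Hypothetical fractions of patients showing a loss in a single region,
# keyed by cancer type. These values are invented for illustration only.
fraction_with_event = {
    "brca": 0.41,
    "coad": 0.35,
    "gbm": 0.05,
    "hnscc": 0.22,
    "ucec": 0.12,
}


def count_cancer_types(fractions, cutoff=0.2):
    """Count the cancer types whose patient fraction meets the cutoff."""
    return sum(1 for fraction in fractions.values() if fraction >= cutoff)


n_types = count_cancer_types(fraction_with_event)
print(n_types)
```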

You can also refer to the gain/loss region tables in this notebook and to the plots from the previous notebooks. Keep the position of the centromere in mind as well, since each event is recorded per arm.
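
One way to keep the centromere in mind is to check that a candidate boundary stays on a single arm. The sketch below is a minimal helper for that check. The name `arm_for_interval` is hypothetical, and the centromere coordinates are a rough placeholder assumed for chromosome 8, not values taken from this notebook; look up the coordinates for the genome build you are using.

```python
def arm_for_interval(start, end, centromere_start, centromere_end):
    """Return 'p', 'q', or 'spans centromere' for an interval given in base pairs."""
    if end <= centromere_start:
        return "p"
    if start >= centromere_end:
        return "q"
    return "spans centromere"


# ASSUMPTION: approximate placeholder coordinates for the chromosome 8
# centromere. Replace them with the values for your genome build.
CENTROMERE_START = 44_000_000
CENTROMERE_END = 45_900_000

# Boundaries taken from the CPTAC metadata saved at the end of this notebook.
p_arm = arm_for_interval(406428, 36784324, CENTROMERE_START, CENTROMERE_END)
q_arm = arm_for_interval(48710789, 145052465, CENTROMERE_START, CENTROMERE_END)
print(p_arm, q_arm)
```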

For each arm, record the start and end boundaries for the event on that arm in the `00_setup/00_set_arm_parameters.ipynb` notebook in the folder for that arm.

For example, if you are looking at 6 cancer types, the ideal boundary includes every region where all 6 cancer types have the event and excludes regions where only a few do. However, the regions in the table won't always line up neatly with the plot, so you sometimes need to include a region with only 5 cancer types in the boundary to match what you see more closely.
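
The rule above can be sketched as a search for the longest contiguous run of regions whose count meets a minimum number of cancer types. This is only an aid for reading the tables, not a replacement for checking the plot. The function name `longest_run` is hypothetical and is not part of `cnvutils`. The rows are the first ten rows of the CPTAC loss table for chromosome 8 shown later in this notebook, where losses carry negative counts.

```python
# (start, end, counts) rows copied from the cptac_losses table in this notebook.
loss_regions = [
    (0, 166086, 0),
    (166086, 406428, -1),
    (406428, 2935353, -6),
    (2935353, 7355517, -7),
    (7355517, 8317736, -6),
    (8317736, 12064389, -7),
    (12064389, 12721906, -6),
    (12721906, 31639222, -7),
    (31639222, 36784324, -6),
    (36784324, 37695782, -5),
]


def longest_run(regions, min_types):
    """Return (start, end) of the longest contiguous run of regions whose
    absolute count is at least min_types, or None if no region qualifies."""
    best = None
    current = None
    for start, end, counts in regions:
        if abs(counts) >= min_types:
            # Extend the current run if this region is adjacent to it;
            # otherwise start a new run at this region.
            if current is not None and current[1] == start:
                current = (current[0], end)
            else:
                current = (start, end)
            if best is None or current[1] - current[0] > best[1] - best[0]:
                best = current
        else:
            current = None
    return best


boundary = longest_run(loss_regions, min_types=6)
print(boundary)  # (406428, 36784324)
```

With `min_types=6` the result matches the CPTAC 8p loss boundary saved at the end of this notebook. Lowering the threshold to 5 would also pull in the last region in the list, which is the kind of trade-off described above.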

In [1]:
import cnvutils

CHROMOSOME = 8

### CPTAC

In [2]:
cptac_plot, cptac_gains, cptac_losses = cnvutils.find_gain_and_loss_regions(
    source="cptac",
    chromosome=CHROMOSOME,
)


In [3]:
cptac_plot

In [4]:
cptac_gains

Unnamed: 0,start,end,counts,length,event
0,0,8701937,0,8701937,gain
1,8701937,12064389,1,3362452,gain
2,12064389,14084845,0,2020456,gain
3,14084845,18170477,1,4085632,gain
4,18170477,18391282,0,220805,gain
5,18391282,20246165,1,1854883,gain
6,20246165,22089150,0,1842985,gain
7,22089150,22165140,1,75990,gain
8,22165140,30384511,0,8219371,gain
9,30384511,31639222,1,1254711,gain


In [5]:
cptac_losses

Unnamed: 0,start,end,counts,length,event
0,0,166086,0,166086,loss
1,166086,406428,-1,240342,loss
2,406428,2935353,-6,2528925,loss
3,2935353,7355517,-7,4420164,loss
4,7355517,8317736,-6,962219,loss
5,8317736,12064389,-7,3746653,loss
6,12064389,12721906,-6,657517,loss
7,12721906,31639222,-7,18917316,loss
8,31639222,36784324,-6,5145102,loss
9,36784324,37695782,-5,911458,loss


### GISTIC

In [6]:
gistic_plot, gistic_gains, gistic_losses = cnvutils.find_gain_and_loss_regions(
    source="gistic",
    chromosome=CHROMOSOME,
    level="gene",
)


In [7]:
gistic_plot

In [8]:
gistic_gains

Unnamed: 0,start,end,counts,length,event
0,0.0,33308060.0,0,33308060.0,gain
1,33308060.0,33591329.0,1,283269.0,gain
2,33591329.0,35235474.0,2,1644145.0,gain
3,35235474.0,35672162.0,4,436688.0,gain
4,35672162.0,36784373.0,3,1112211.0,gain
5,36784373.0,48710788.0,5,11926415.0,gain
6,48710788.0,91954966.0,6,43244178.0,gain
7,91954966.0,94895798.0,7,2940832.0,gain
8,94895798.0,99013273.0,6,4117475.0,gain
9,99013273.0,100133350.0,7,1120077.0,gain


In [9]:
gistic_losses

Unnamed: 0,start,end,counts,length,event
0,0.0,166085.0,0,166085.0,loss
1,166085.0,12727231.0,-8,12561146.0,loss
2,12727231.0,36784373.0,-7,24057142.0,loss
3,36784373.0,37403515.0,-6,619142.0,loss
4,37403515.0,38382430.0,-5,978915.0,loss
5,38382430.0,40153481.0,-4,1771051.0,loss
6,40153481.0,41261961.0,-3,1108480.0,loss
7,41261961.0,42391760.0,-1,1129799.0,loss
8,42391760.0,42752619.0,0,360859.0,loss
9,42752619.0,43056322.0,-1,303703.0,loss


## Save metadata

In [10]:
CHROMOSOME = 8

cptac_cancer_types = [
    "brca",
#     "ccrcc", # Event not seen
    "coad",
#     "gbm", # Event not seen
    "hnscc",
    "lscc",
    "luad",
    "ov",
#     "pdac", # Event not seen
#     "ucec", # Event not seen
]

gistic_cancer_types = [
    "brca",
#     "ccrcc", # Event not seen
    "coad",
#     "gbm", # Event not seen
    "hnscc",
    "lscc",
    "luad",
    "ov",
    "pdac",
#     "ucec", # Event not seen
]

# CPTAC 8p loss
cnvutils.save_event_metadata(
    metadata={
        "cancer_types": cptac_cancer_types,
        "start": 406428, # 406,428 bp
        "end": 36784324, # 36,784,324 bp
        "type": "loss",   
    },
    source="cptac",
    chromosome=CHROMOSOME,
    arm="p",
    gain_or_loss="loss",
)

# CPTAC 8q gain
cnvutils.save_event_metadata(
    metadata={
        "cancer_types": cptac_cancer_types,
        "start": 48710789, # 48,710,789 bp
        "end": 145052465, # 145,052,465 bp 
        "type": "gain",   
    },
    source="cptac",
    chromosome=CHROMOSOME,
    arm="q",
    gain_or_loss="gain",
)

# GISTIC 8p loss
cnvutils.save_event_metadata(
    metadata={
        "cancer_types": gistic_cancer_types,
        "start": 166085, # 166,085 bp
        "end": 37403515, # 37,403,515 bp
        "type": "loss",    
    },
    source="gistic",
    level="gene",
    chromosome=CHROMOSOME,
    arm="p",
    gain_or_loss="loss",
)

# GISTIC 8q gain
cnvutils.save_event_metadata(
    metadata={
        "cancer_types": gistic_cancer_types,
        "start": 48710788, # 48,710,788 bp
        "end": 145052466, # 145,052,466 bp
        "type": "gain",
    },
    source="gistic",
    level="gene",
    chromosome=CHROMOSOME,
    arm="q",
    gain_or_loss="gain",
)
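
As a quick sanity check on the values saved above, the sketch below compares the CPTAC and GISTIC boundaries for each arm by dividing the shared length by the combined span. The helper name `overlap_fraction` is hypothetical and is not part of `cnvutils`. The coordinates are copied from the metadata cells in this notebook.

```python
def overlap_fraction(a, b):
    """Shared length of two (start, end) intervals divided by their combined span."""
    shared = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    combined = max(a[1], b[1]) - min(a[0], b[0])
    return shared / combined if combined > 0 else 0.0


# Boundaries copied from the save_event_metadata calls above.
cptac_8p_loss = (406428, 36784324)
gistic_8p_loss = (166085, 37403515)
cptac_8q_gain = (48710789, 145052465)
gistic_8q_gain = (48710788, 145052466)

p_overlap = overlap_fraction(cptac_8p_loss, gistic_8p_loss)
q_overlap = overlap_fraction(cptac_8q_gain, gistic_8q_gain)
print(f"8p loss overlap: {p_overlap:.4f}")
print(f"8q gain overlap: {q_overlap:.4f}")
```

A value close to 1 means the two sources agree on the boundary. A much lower value suggests that one of the boundaries should be checked against the plots again.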