

# Workflow

```python
import sys
```
You’ll read the path to the read-counts file from `sys.argv[1]`.

```python
spliceevents5={}
spliceevents3={}
splicesites={}
splicesites5={}
splicesites3={}
```
The script sets up five dictionaries (all keys are tab-separated strings):

- `spliceevents5`: a “seen-once” tracker for 5′ sites (key = `"chr\tstart"`).
- `spliceevents3`: a “seen-once” tracker for 3′ sites (key = `"chr\tend"`).
- `splicesites`: maps a **full region** `"chr\tstart\tend"` → `"gene\tflag_list"`. If the exact same region appears again, the new flag is appended to `flag_list` (comma-separated).
- `splicesites5`: keys are 5′ sites **only when they’re alternative** (i.e., shared by ≥2 distinct regions). Values start at 0 and later hold the **sum of reads** across all regions that share that 5′ coordinate.
- `splicesites3`: same idea for 3′ sites that are **alternative**.

```python
with open("Splicesitestolookat_uniqueonly_groups.bed","r") as file:
    for line in file:
        entry=line.strip().split("	")
        splice5=entry[0]+"	"+entry[1]
        splice3=entry[0]+"	"+entry[2]
        splice=entry[0]+"	"+entry[1]+"	"+entry[2]
```
- Parse each BED line.
- Build keys:
  - `splice5` = `chr  start` → identifies the **5′ coordinate** of the junction.
  - `splice3` = `chr  end` → identifies the **3′ coordinate**.
  - `splice`  = `chr  start  end` → the **full region**.
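
To make the key-building step concrete, here is a minimal sketch that parses one made-up BED line. The helper name `build_keys` and the sample line are assumptions for illustration and are not part of the original script.

```python
def build_keys(line):
    """Split one tab-separated BED line and return (splice5, splice3, splice)."""
    entry = line.strip().split("\t")
    chrom, start, end = entry[0], entry[1], entry[2]
    splice5 = chrom + "\t" + start               # chr + start
    splice3 = chrom + "\t" + end                 # chr + end
    splice = chrom + "\t" + start + "\t" + end   # full region
    return splice5, splice3, splice

# Hypothetical BED line with columns: chr, start, end, gene, flag
sample = "chr1\t100\t200\tGENE1\tA\n"
splice5, splice3, splice = build_keys(sample)
print(repr(splice5), repr(splice3), repr(splice))
```

Note that the coordinates stay strings here, so two keys match only if their text is identical (for example, `"100"` and `"0100"` would not match).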

```python
        if splice  in splicesites:
            splicesites[splice]+=","+entry[4]
        else:
            splicesites[splice]=entry[3]+"	"+entry[4]
```
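- If this exact region has been seen before, append the new 5th column (`entry[4]`) after a comma, building a value like `"gene\tflag1,flag2,..."`. The gene name (`entry[3]`) on the repeated line is not used; only the gene from the first occurrence is kept.
- Otherwise, initialize the value with `"gene\tflag"`.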
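
The accumulation behaviour can be checked on a toy example. The function `add_region` below is a hypothetical wrapper around the same `if`/`else` logic, and the region, gene, and flags are made up.

```python
def add_region(splicesites, splice, gene, flag):
    """Record a region; return True if the region was new, False if repeated."""
    if splice in splicesites:
        # Repeat: only the flag is appended, the gene argument is ignored.
        splicesites[splice] += "," + flag
        return False
    splicesites[splice] = gene + "\t" + flag
    return True

splicesites = {}
first = add_region(splicesites, "chr1\t100\t200", "GENE1", "A")
second = add_region(splicesites, "chr1\t100\t200", "GENE1", "B")
print(splicesites)
```

Returning whether the region was new is a design choice of this sketch; it makes explicit the condition that decides which branch the original code takes.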

```python
            if splice5 in spliceevents5:
                splicesites5[splice5]=0
            else:
                spliceevents5[splice5]=0
            if splice3 in spliceevents3:
                splicesites3[splice3]=0
            else:
                spliceevents3[splice3]=0
```
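This is the **alternative-site detector**. Note the indentation: the block sits inside the `else:` branch above, so it runs only the first time a region is seen. An exact repeat of a region does not count as another sighting of its 5′ or 3′ site.

- First time you see a 5′ key → store it in `spliceevents5` (seen once).
- Second (or later) time the same 5′ key appears in a *different* region → now it’s *alternative*: insert that key into `splicesites5` with value 0.
  - Result: **only 5′ sites shared by ≥2 distinct regions** end up in `splicesites5`.
- The same logic applies to 3′ keys, using `spliceevents3`/`splicesites3`.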

So `splicesites5`/`splicesites3` are **the sets of alternative sites** you’ll compute totals for.
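
A minimal sketch of the detector on made-up coordinates is shown below. It uses tuples and sets instead of the script's tab-joined strings and seen-once dictionaries, which is a substitution made here for readability; the function name `find_alternative_sites` is hypothetical. Exact repeats of a region are skipped, mirroring the fact that the detector in the script runs only for regions not seen before.

```python
def find_alternative_sites(regions):
    """Return (alt5, alt3): sites shared by at least two distinct regions."""
    seen_regions = set()
    seen5, seen3 = set(), set()   # play the role of spliceevents5 / spliceevents3
    alt5, alt3 = {}, {}           # play the role of splicesites5 / splicesites3
    for chrom, start, end in regions:
        region = (chrom, start, end)
        if region in seen_regions:
            continue              # exact repeat: not a new sighting
        seen_regions.add(region)
        key5, key3 = (chrom, start), (chrom, end)
        if key5 in seen5:
            alt5[key5] = 0        # second distinct region with this 5' site
        else:
            seen5.add(key5)
        if key3 in seen3:
            alt3[key3] = 0        # second distinct region with this 3' site
        else:
            seen3.add(key3)
    return alt5, alt3

regions = [
    ("chr1", 100, 200),
    ("chr1", 100, 300),   # shares its 5' site with the first region
    ("chr1", 150, 300),   # shares its 3' site with the second region
    ("chr1", 100, 200),   # exact repeat of the first region
    ("chr2", 500, 600),   # shares nothing
]
alt5, alt3 = find_alternative_sites(regions)
print(alt5, alt3)
```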

---

```python
splicereadscountsfile=sys.argv[1]
readcounts={}
with open(splicereadscountsfile,"r") as file:
    for line in file:
        entry=line.strip().split("	")
        readcounts[entry[0]+"	"+entry[1]+"	"+str(int(entry[2])+1)]=int(entry[3])
```
- Read the *read-counts* file given as your first CLI argument.
- Build a key of the form `"chr\tstart\tend_adjusted"` and store `read_count` as an integer.
- **Important detail:** only `end` is converted to `int` and shifted by **`+1`**; `start` is used unchanged.
  - This is an off-by-one harmonization step. BED intervals are 0-based and half-open, so the BED `end` is exclusive. The code assumes that `end` in the counts file is one less than `end` in the BED regions loaded earlier, which would be the case if, for example, the counts file reports an inclusive end on the same 0-based scale. If your counts file already uses the same coordinates as the BED file, this `+1` will misalign the keys, the lookups will miss, and the affected regions will silently default to a count of 0 in the next step.
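
The effect of the shift can be seen on a single made-up line. In this sketch, `load_counts` and its `end_shift` parameter are hypothetical; the original script hard-codes the `+1`.

```python
def load_counts(lines, end_shift=1):
    """Build {"chr\\tstart\\tend": count} with `end` shifted by `end_shift`."""
    counts = {}
    for line in lines:
        chrom, start, end, n = line.strip().split("\t")[:4]
        key = chrom + "\t" + start + "\t" + str(int(end) + end_shift)
        counts[key] = int(n)
    return counts

# Made-up counts line whose end (199) is one less than the BED end (200).
counts_lines = ["chr1\t100\t199\t12\n"]
bed_region = "chr1\t100\t200"

shifted = load_counts(counts_lines)                 # end becomes 200
unshifted = load_counts(counts_lines, end_shift=0)  # end stays 199
print(bed_region in shifted, bed_region in unshifted)  # True False
```

A quick sanity check on real data is to count how many BED regions are found in the counts dictionary; a hit rate near zero suggests the shift does not match your file's coordinate convention.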

---

```python
for region in splicesites:
    if region not in readcounts:
        readcounts[region]=0
```
- Ensure every region has a count (default to 0 if missing).

```python
    entry=region.split("	")
    splice5=entry[0]+"	"+entry[1]
    splice3=entry[0]+"	"+entry[2]
    if splice5 in  splicesites5:
        splicesites5[splice5]=splicesites5[splice5]+readcounts[region]
    if splice3 in  splicesites3:
        splicesites3[splice3]=splicesites3[splice3]+readcounts[region]
```
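- For each region, add its **read count** to the running total of its 5′ site and of its 3′ site, but only if that site is in `splicesites5`/`splicesites3` (i.e., it is alternative). After this loop, each value holds the total reads across all regions that share that coordinate.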
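
The summation step can be sketched on toy data as follows. The function name `sum_group_reads` and all values are made up. Unlike the original loop, which writes a 0 into `readcounts` for missing regions, this sketch uses `dict.get` and leaves `readcounts` untouched; that is a simplification and matters only if later code relies on every region being present in `readcounts`.

```python
def sum_group_reads(regions, readcounts, alt5, alt3):
    """Sum read counts per alternative 5' and 3' site."""
    totals5 = dict.fromkeys(alt5, 0)
    totals3 = dict.fromkeys(alt3, 0)
    for region in regions:
        count = readcounts.get(region, 0)   # missing regions count as 0
        chrom, start, end = region.split("\t")
        key5 = chrom + "\t" + start
        key3 = chrom + "\t" + end
        if key5 in totals5:
            totals5[key5] += count
        if key3 in totals3:
            totals3[key3] += count
    return totals5, totals3

regions = ["chr1\t100\t200", "chr1\t100\t300", "chr1\t150\t300"]
readcounts = {"chr1\t100\t200": 30, "chr1\t100\t300": 10}  # third region missing
totals5, totals3 = sum_group_reads(
    regions, readcounts, alt5={"chr1\t100"}, alt3={"chr1\t300"}
)
print(totals5, totals3)
```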

---

```python
for region in splicesites:
     entry=region.split("	")
     splice5=entry[0]+"	"+entry[1]
     splice3=entry[0]+"	"+entry[2]
```
- Iterate regions again and rebuild the 5′/3′ keys.

### Printing relative fractions for 5′-alternative groups
```python
     if splice5 in  splicesites5:
         if splicesites5[splice5] == 0:
             print (region+"	"+splicesites[region]+"	NA	5")
         else:
             rel=readcounts[region]/splicesites5[splice5]
             print (region+"	"+splicesites[region]+"	"+str(rel)+"	"+str(readcounts[region])+"	"+str(splicesites5[splice5])+"	5")
```

### Printing relative fractions for 3′-alternative groups
```python
     if splice3 in  splicesites3:
         if splicesites3[splice3] == 0:
             print (region+"	"+splicesites[region]+"	NA	3")
         else:
             rel=readcounts[region]/splicesites3[splice3]
             print (region+"	"+splicesites[region]+"	"+str(rel)+"	"+str(readcounts[region])+"	"+str(splicesites3[splice3])+"	3")
```
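
The two print blocks emit one row per region and alternative site, so a region whose 5′ and 3′ sites are both alternative is printed twice. A row with a non-zero group total has nine tab-separated columns (chr, start, end, gene, flags, fraction, region reads, group total, site type), while an `NA` row has only seven. The sketch below reproduces that layout; `format_row` is a hypothetical helper and the values are made up. It assumes Python 3, where `/` is true division. Under Python 2, `/` on two integers truncates, so the fraction would come out as 0 or 1.

```python
def format_row(region, annotation, count, total, site_type):
    """Format one output row; use NA when the group total is zero."""
    if total == 0:
        return "\t".join([region, annotation, "NA", site_type])
    rel = count / total  # fraction of the group's reads carried by this region
    return "\t".join(
        [region, annotation, str(rel), str(count), str(total), site_type]
    )

row = format_row("chr1\t100\t200", "GENE1\tA", 30, 40, "5")
na_row = format_row("chr1\t150\t300", "GENE1\tB", 0, 0, "3")
print(row)
print(na_row)
```

Because the two row types differ in width, a downstream parser should not assume a fixed number of columns.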

---

