### Problem

Given two strings $s$ and $t$, $t$ is a substring of $s$ if $t$ is contained as a contiguous
collection of symbols in $s$ (as a result, $t$ must be no longer than $s$).

The position of a symbol in a string is the total number of symbols found to its left, including
itself (e.g., the positions of all occurrences of 'U' in "AUGCUUCAGAAAGGUCUUACG" are 2, 5, 6, 15,
17, and 18). The symbol at position $i$ of $s$ is denoted by $s[i]$.
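
The position convention above is 1-based, whereas Python indexes from 0. As a minimal sketch (the helper name `symbol_positions` is hypothetical, not part of the problem statement), the example positions can be checked like this:

```python
def symbol_positions(s, symbol):
    """Return the 1-based positions of every occurrence of `symbol` in `s`."""
    # enumerate(..., start=1) counts symbols from 1, matching the definition above.
    return [i for i, ch in enumerate(s, start=1) if ch == symbol]


rna = "AUGCUUCAGAAAGGUCUUACG"
print(*symbol_positions(rna, "U"))  # 2 5 6 15 17 18
```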

A substring of $s$ can be represented as $s[j:k]$, where $j$ and $k$ represent the starting and ending
positions of the substring in $s$; for example, if $s$ = "AUGCUUCAGAAAGGUCUUACG", then $s[2:5]$ =
"UGCU".
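
Note that this notation uses 1-based positions with both endpoints included, which differs from Python's 0-based, end-exclusive slices. A small sketch of the conversion, assuming a hypothetical helper named `substring`:

```python
def substring(s, j, k):
    """Return s[j:k] in the problem's notation (1-based, both ends inclusive)."""
    # Shift the start down by one; Python's exclusive end already lands on k.
    return s[j - 1:k]


s = "AUGCUUCAGAAAGGUCUUACG"
print(substring(s, 2, 5))  # UGCU
```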

The location of a substring $s[j:k]$ is its beginning position $j$; note that $t$ will have multiple
locations in $s$ if it occurs more than once as a substring of $s$ (see the Sample below).

**Given**: Two DNA strings $s$ and $t$ (each of length at most 1 kbp).

**Return**: All locations of $t$ as a substring of $s$.

**Sample Dataset**

```
GATATATGCATATACTT
ATAT
```

**Sample Output**

```
2 4 10
```


### Practice

In [7]:
dna = "GATATATGCATATACTT"
motif = "ATAT"

In [11]:
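def find_motif(dna, motif):
    """Return the 1-based locations of every occurrence of motif in dna."""
    motif_length = len(motif)

    motif_locations = []
    # Slide a window of motif_length over dna; overlapping matches are kept.
    for i in range(len(dna) - motif_length + 1):
        if dna[i:i+motif_length] == motif:
            # Convert Python's 0-based index to the 1-based position.
            motif_locations.append(i+1)

    return motif_locations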

In [12]:
locs = find_motif(dna, motif)
print(*locs)

2 4 10
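
As an alternative sketch (not the approach used above), the same locations can be collected with the built-in `str.find`, restarting the search one character past each hit so that overlapping occurrences are not skipped. The function name `find_motif_with_str_find` is hypothetical.

```python
def find_motif_with_str_find(dna, motif):
    """Return 1-based locations of motif in dna, including overlapping ones."""
    locations = []
    start = dna.find(motif)  # 0-based index of the first hit, or -1 if none
    while start != -1:
        locations.append(start + 1)  # convert to a 1-based position
        # Resume one character later so overlapping matches are found too.
        start = dna.find(motif, start + 1)
    return locations


print(*find_motif_with_str_find("GATATATGCATATACTT", "ATAT"))  # 2 4 10
```

Advancing by `start + len(motif)` instead would miss the occurrence at position 4 in the sample, since it overlaps the one at position 2.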


### Real data

In [13]:
with open("../files/SUBS_real.txt", "r") as infile:
    dna, motif = infile.read().splitlines()

In [14]:
locs = find_motif(dna, motif)
print(*locs)

220 417 434 473 480 504 511 610 617 648 664 692 699 716 763 776 800 807
