# Intro to Methods and Polymorphism
## With Strings and DNA Sequences

`str` objects (strings) in Python represent text.  They can be created in several ways:

```python
>>> 'Hello'  # with apostrophes ("single-quotes")
'Hello'

>>> "Hello"  # with quotation marks ("double-quotes")
'Hello'

>>> """Hello,
... my name is
... Nick"""   # with triple-double-quotes (used for multi-line text and for docstrings)
'Hello,\nmy name is\nNick'

>>> str(32)  # using the str() function to change into a string
'32'
```

Nucleotide sequences are often represented as strings:

```python
>>> seq = 'GCATTGGCT'
```

## Built-In String Operations, Functions, and Methods

Modify each DNA sequence in the exercises below, using a single line of code, to match what is asked for.  The operations, functions, and methods that may be used are:

### Operations

The same operations we've used on numbers and lists work on strings!

  - `'GTC' * 3  # Repeats a string N times`
  - `'GTC' + 'GTC'  # Concatenates two strings`
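  - `'GTC'[0]  # Gets the first character`
  - `'GTC'[-1]  # Gets the last character`
  - `'GTC'[1:]  # Slices from the second character to the end`
  - `'GTC'[:-1]  # Slices everything except the last character`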
  - `'GTC'[::-1]   # Reverses the sequence`
  - `'GTC' == 'GTC'  # If they are the same, then True`
  - `'GTC' != 'GTC'  # If they are different, then True`
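
The operations listed above can be tried on a short sequence. The following is a minimal sketch; the variable names are chosen here for illustration only. Strings cannot be changed in place, so every operation produces a new value and leaves `codon` untouched.

```python
codon = 'GTC'  # illustrative name, not from the original text

repeated = codon * 3          # 'GTCGTCGTC'
joined = codon + 'AAT'        # 'GTCAAT'
first = codon[0]              # 'G'
last = codon[-1]              # 'C'
all_but_first = codon[1:]     # 'TC'
all_but_last = codon[:-1]     # 'GT'
reversed_codon = codon[::-1]  # 'CTG'
same = codon == 'GTC'         # True
different = codon != 'GTC'    # False

print(repeated, joined, reversed_codon)
print(codon)  # still 'GTC'
```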


### Built-In Functions for Strings and their Methods

Strings also come with their own functions.  Functions that belong to a type are called **"Methods"**; calling a method on a string automatically passes that string into the function as its first argument.

| Function | Method Syntax Equivalent |
| :--           |  :---- |
| `str.count('GTC', 'A')` |`'GTC'.count('A')` |
| `str.upper('GtC')` | `'GtC'.upper()` |
| `str.lower('GTc')` |  `'GTc'.lower()` |
| `str.isdigit('GTC')` |`'GTC'.isdigit()` |
| `str.index('GTC', 'T')` | `'GTC'.index('T')` |
| `str.replace('GTC', 'G', 'C')` |  `'GTC'.replace('G', 'C')` |
|  `str.split('GTC-CCA', '-')` | `'GTC-CCA'.split('-')` |
| `len('GTC')` |  -None- |
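
The equivalence in the table can be checked directly. This is a small sketch with a made-up sequence; it also shows that `len()`, which has no method-style spelling in the table, accepts other types such as lists.

```python
seq = 'GtC-cca'  # illustrative sequence, not from the exercises

# Function syntax: the string is passed in explicitly.
upper_via_function = str.upper(seq)
# Method syntax: the string before the dot is passed in automatically.
upper_via_method = seq.upper()
print(upper_via_function == upper_via_method)  # True

# Each method returns a new string, so calls can be chained.
parts = seq.upper().split('-')
print(parts)  # ['GTC', 'CCA']

# len() works on strings and on lists alike.
print(len('GTC'), len(['G', 'T', 'C']))  # 3 3
```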


**Exercises**

**Example**: Count the number of "G" nucleotides in the following sequence:

```python
>>> seq = "GTGTCAGTCCCCATGAATCGATAG"
>>> seq.count('G')
6
```

Count the number of "C" nucleotides in the following sequence:

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

Count the number of "AT" repeats in the following sequence:

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

Concatenate the following two sequences (i.e. combine them into one sequence):

```python
seq1 = "GTGTCAGT"
seq2 = "TGAATCGATAG"
```

How long is the following sequence?

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

What is the 2nd nucleotide in this sequence?

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

What is the 3rd-from-the-last nucleotide in this sequence?

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

Repeat the following sequence 13 times:

```python
gc = "GC"
```

Replace the incorrect letter with an empty string (i.e. delete the letter). (Hint: an empty string is just a pair of quotes, like `''` or `""`)

```python
seq = "GTGXXGTXCCXCCATGXAATCGXATA"
```

Keep only the first six nucleotides in this sequence:

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

Standardize the formatting of this sequence by either upper- or lower-casing the letters:

```python
seq = "GtCGAaaCCgTaGcTAgc"
```

Split the following string at the spaces into a list of sequences. (Hint: the string for a space is quotes with a space between them, like `' '` or `" "`)

```python
seqs = "GTTCGAAAG GACCTGATTATAG AACCGATTTA"
```

Reverse this sequence:

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

What percentage of the nucleotides in this sequence are strong (G or C)?  (Hint: count the Gs and Cs, divide by the total number of nucleotides, then multiply by 100)

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

Is this sequence the same forwards and backwards (i.e. a palindrome)?

```python
seq = "TCGATCTAGCGCGAATATCGGAGAAGAGGCTATAAGCGCGATCTAGCT"
```

## Text Files

### Writing Strings to Files

Strings can be saved to text files by creating a file object with the `open()` function and writing the string to it.  Here is one way to do it:

```python
my_file = open('myfile.txt', 'w')  # get a file object open in 'write' mode
my_file.write('This is my text')  # call the file.write() method
my_file.close()  # call the file.close() method
```
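
A common alternative to calling `close()` by hand is the `with` statement, which closes the file automatically when its indented block ends, even if an error occurs while writing. A minimal sketch, reusing the `myfile.txt` name from above:

```python
# The file is open only inside the indented block.
with open('myfile.txt', 'w') as my_file:
    my_file.write('This is my text')

# After the block, the file has already been closed for us.
print(my_file.closed)  # True
```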


### Reading Strings from Files

Reading works in a similar way:

```python
my_file = open('myfile.txt')
text = my_file.read()
my_file.close()
```
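
Files can also be read inside a `with` statement, which closes the file when the block ends. The sketch below first writes a two-line file so that it runs on its own; the file name `two_lines.txt` and its contents are made up for illustration.

```python
# Write a small two-line file so the example is self-contained.
with open('two_lines.txt', 'w') as my_file:
    my_file.write('GTGTCAGT\nTGAATCGATAG\n')

# open() uses read mode ('r') by default.
with open('two_lines.txt') as my_file:
    text = my_file.read()  # the whole file as one string, newlines included

lines = text.splitlines()  # split into a list of lines, without the newlines
print(lines)  # ['GTGTCAGT', 'TGAATCGATAG']
```

Note that `read()` returns everything as a single string; `splitlines()` is one way to get at the individual lines afterwards.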


**Exercises**

Write the following sequence to a text file named "sequence.txt":

```python
seq = "GTGTCAGTCCCCATGAATCGATAG"
```

Read the sequence from the file back into Python.

### Online Text

To get text data from the internet, we can use the standard-library `urllib` package, whose `urllib.request` module opens URLs and reads the responses.

```python
from urllib.request import urlopen

url = "https://docs.python-requests.org/en/master/"
data = urlopen(url).read()  # raw response body, as bytes
text = data.decode()  # convert bytes to str (UTF-8 by default)
```
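
`urlopen(url).read()` returns a `bytes` object rather than a `str`, which is why the `decode()` step is needed before string methods give the expected results. The sketch below imitates this without a network connection; the hand-written `bytes` value is an assumption standing in for a real response.

```python
# Stand-in for the bytes that urlopen(url).read() would return.
data = b'>seq1\nGTGTCAGT\n'
print(type(data))  # <class 'bytes'>

# decode() converts bytes to str; 'utf-8' is also the default encoding.
text = data.decode('utf-8')
print(type(text))  # <class 'str'>

# Once decoded, the usual string methods apply.
print(text.count('G'))  # 3
```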

**Exercises**

Roughly how many letters are in William Shakespeare's play "Romeo and Juliet"?  Use the following link to the play:

```python
url = "https://raw.githubusercontent.com/cgovella/learning/master/edx-python/case%20studies/gutenverg/Books/English/shakespeare/Romeo%20and%20Juliet.txt"
```

Does the word "Romeo" or "Juliet" appear more often in the text?

Write the play into a text file called "romeo_and_juliet.txt"