# Session 3 : Detecting DNA Polymorphisms - Conditional Statements

## Learning Objectives

* Escapes
* Conditional statements

## 3.1 Escape characters

Non-printable characters such as line breaks can be represented with escape sequences written in backslash notation. We will make frequent use of

    \n	New line
    \t	Tab

An escape sequence is interpreted in single-quoted, double-quoted, and triple-quoted strings alike. See the Python documentation for a complete list of escape sequences. The backslash can also cancel the special meaning of the character that follows it, which is particularly useful in printing: a doubled backslash prints a literal backslash, and a backslash before a single or double quote prints that quote without resorting to triple quotes.


In [1]:
#!/usr/bin/env python

# Example 3.1
# Name: escape_characters.py
# Description: A program to test escape characters

print ('Hello \n world!')
print ('Hello \t\t world!')
print ('Hello \\n world!')
print ('I\'m here!')

Hello 
 world!
Hello 		 world!
Hello \n world!
I'm here!
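
As a further illustration, escapes can lay out simple tabular output. The sketch below uses a made-up sequence name and length (both hypothetical) to build a two-line, tab-separated table, and it also shows that Python stores an escape sequence as a single character.

```python
# Hypothetical values chosen only for illustration
name = 'seq1'
length = 24

# \t separates the columns, \n separates the rows
table = 'Name\tLength\n' + name + '\t' + str(length)
print(table)

# An escape sequence is stored as one character
print(len('\n'))    # 1
print(len('\\n'))   # 2: a backslash followed by the letter n
```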


## 3.2 Flow Control

Two of the most essential components of programming are logic and repetition. To add these to your Python programs, there are three keywords that you will need: <b>if</b>, <b>for</b>, and <b>while</b>.

A program executes from the first statement at the top of the program to the last statement at the bottom, in order, unless told to do otherwise. You can control the order in which the statements of a program are executed by using conditional statements and loops. A conditional statement executes a group of statements only if the conditional test succeeds; otherwise, it just skips the group of statements.

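A conditional evaluates whether a statement is "true" or "false". However, "What is truth?" is a question to consider carefully when programming, because Python treats some values, such as zero and the empty string, as false and most other values as true.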
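
To make the question of truth concrete, the short sketch below prints the result of a comparison and uses the built-in bool() function to show how Python would treat a few values if they were used as a condition. The variable name is_short is introduced here only for illustration.

```python
DNA = 'AGTTGTAATGAGGCTGCCGTGATA'

# A comparison produces one of the two values True or False
is_short = len(DNA) < 100
print(is_short)        # True

# bool() reports how Python treats a value when it is used as a condition
print(bool(''))        # False: an empty string
print(bool('ATG'))     # True: a non-empty string
print(bool(0))         # False: the number zero
print(bool(24))        # True: a non-zero number
```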

The basic form of an <b>if</b> statement is 

<pre>
if (condition) :
    block
</pre>

Here is an example program:

In [2]:
#!/usr/bin/env python

# Example 3.2
# Name: If.py
# Description:  This program tests the if statement

DNA = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA_length = len(DNA)

if (DNA_length < 100) :
    print ('This is a short piece of DNA')

This is a short piece of DNA


There are three important things to remember with the <b>if</b> statement: the condition, the colon, and the indentation of the block following the colon. The brackets () around the condition are optional in Python, although the examples here use them. There can be multiple lines in the block, as long as they are all indented by the same amount.
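
As a minimal sketch of these rules, the variation on Example 3.2 below writes the condition without brackets and places two lines in the block. The final print statement is not indented, so it runs whether or not the test succeeds.

```python
DNA = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA_length = len(DNA)

# No brackets around the condition; the colon and the indentation are still required
if DNA_length < 100:
    # Both indented lines belong to the block and run only when the test succeeds
    print('This is a short piece of DNA')
    print('Its length is %d nucleotides' % DNA_length)

# This line is not indented, so it always runs
print('Done')
```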

### if...else

The basic form of an <b>if...else</b> statement is
<pre>
if (condition) :
    block
else :
    block
</pre>

In [3]:
#!/usr/bin/env python

# Example 3.3
# Name: If_else.py
# Description:  This program tests the if...else statement

DNA = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA_length = len(DNA)

if (DNA_length > 100) :
    print ('This is a long piece of DNA')

else :
    print ('This is a short piece of DNA')

This is a short piece of DNA


### if...elif...else

We could also use an if...elif...else statement, which has the form

<pre>
if (condition) :
    block
elif (condition 2) :
    block
elif (condition 3) :
    block
else :
    block
</pre>


In [4]:
#!/usr/bin/env python

# Example 3.4
# Name: If_elif.py
# Description:  This program tests the if...elif statement

DNA = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA_length = len(DNA)

if (DNA_length > 500) :
    print ('This is a long piece of DNA')
elif (100 < DNA_length <= 500) :
    print ('This is a medium piece of DNA')
else:
    print ('This is a short piece of DNA')

This is a short piece of DNA
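
Because an <b>elif</b> condition is only tested when every condition before it has failed, the range test in Example 3.4 can be shortened. The sketch below is one possible variation; the variable name size is introduced here only for illustration.

```python
DNA = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA_length = len(DNA)

if DNA_length > 500:
    size = 'long'
elif DNA_length > 100:
    # Reached only when the first test failed, so DNA_length <= 500 here
    size = 'medium'
else:
    # Reached only when both tests failed, so DNA_length <= 100 here
    size = 'short'

print('This is a %s piece of DNA' % size)
```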


## 3.3 Operators

In the above examples we used the greater than and less than operators. Here are some of the other common Python operators.

### Comparison Operators

<pre>
==	True if the values of the two operands are equal.
!=	True if the values of the two operands are not equal.
<>	Python 2 only: same meaning as !=. It was removed in Python 3, so use != instead.
>=	True if the value of the left operand is greater than or equal to the value of the right operand.
<=	True if the value of the left operand is less than or equal to the value of the right operand.
</pre>
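
The short sketch below applies these comparison operators to two sequences taken from the examples in this session and prints the resulting True or False values.

```python
DNA1 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA2 = 'TCTTGTAATGAGCCTGCCGTGATT'

print(len(DNA1) == len(DNA2))   # True: both sequences are 24 nucleotides long
print(DNA1 != DNA2)             # True: the sequences differ
print(len(DNA1) >= 24)          # True
print(len(DNA1) <= 23)          # False
```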

### Membership Operators

Python has membership operators, which test whether a value occurs in a sequence such as a string. The two membership operators are explained below:


<pre>
<b>in</b>	    Evaluates to true if the value is found in the specified sequence and false otherwise.
<b>not in</b>	Evaluates to true if the value is not found in the specified sequence and false otherwise.
</pre>

In [5]:
#!/usr/bin/env python

# Example 3.5
# Name: membership.py
# Description:  This program tests the membership operator

DNA = 'AGTTGTAATGAGGCTGCCGTGATA'

if ('T' in DNA) :
    print ('This is probably a piece of DNA since it contains thymidine')
elif ('U' in DNA) :
    print ('This is probably a piece of RNA since it contains uracil')

This is probably a piece of DNA since it contains thymidine
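
The <b>not in</b> operator works the same way in reverse, and with strings both operators can test for a longer substring as well as a single character. A minimal sketch, with messages written only for illustration:

```python
DNA = 'AGTTGTAATGAGGCTGCCGTGATA'

# not in is true when the character is absent from the string
if 'U' not in DNA:
    print('No uracil found, so this is unlikely to be RNA')

# in also matches a run of several characters
if 'ATG' in DNA:
    print('The sequence contains ATG, a possible start codon')
```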


### Logical Operators

<pre>
<b>and</b> - Logical AND operator. If both operands are true, the condition is true.
<b>or</b>  - Logical OR operator. If at least one of the two operands is true, the condition is true.
<b>not</b> - Logical NOT operator. Reverses the logical state of its operand: if a condition is true, not makes it false.
</pre>

In [6]:
#!/usr/bin/env python

# Example 3.6
# Name: logical.py
# Description:  This program tests logical operators

DNA = 'AGTTGTAATGAGGCTGCCGTGATA'

if ('A' in DNA and 'C' in DNA and 'G' in DNA and 'T' in DNA) :
    print ('This DNA sequence contains all 4 nucleotides')

This DNA sequence contains all 4 nucleotides
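
The <b>or</b> and <b>not</b> operators can be used in the same way. The sketch below looks for any of the three stop codon strings anywhere in the sequence, ignoring reading frame, and checks that the ambiguity code N is absent. The variable names has_stop and no_ambiguity are introduced here only for illustration.

```python
DNA = 'AGTTGTAATGAGGCTGCCGTGATA'

# or: true when at least one of the tests is true
has_stop = 'TAA' in DNA or 'TAG' in DNA or 'TGA' in DNA
if has_stop:
    print('This DNA sequence contains at least one stop codon string')

# not: reverses the result of the membership test
no_ambiguity = not ('N' in DNA)
if no_ambiguity:
    print('This DNA sequence contains no ambiguous nucleotides (N)')
```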


In [7]:
#!/usr/bin/env python

# Example 3.7
# DNA_equivalence.py
# A program that tests whether DNA fragments are identical

# DNA fragments
DNA1 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA2 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA3 = 'TCTTGTAATGAGCCTGCCGTGATT'

# Test to see if the DNA sequences are the same

if DNA1 == DNA2 :
    print ('DNA1 and DNA2 are the same \n%s\n%s\n' % (DNA1, DNA2))
else :
    print ('DNA1 and DNA2 are not the same \n%s\n%s\n' % (DNA1, DNA2))

if DNA2 == DNA3 :
    print ('DNA2 and DNA3 are the same \n%s\n%s\n' % (DNA2, DNA3))
else :
    print ('DNA2 and DNA3 are not the same \n%s\n%s\n' % (DNA2, DNA3))

DNA1 and DNA2 are the same 
AGTTGTAATGAGGCTGCCGTGATA
AGTTGTAATGAGGCTGCCGTGATA

DNA2 and DNA3 are not the same 
AGTTGTAATGAGGCTGCCGTGATA
TCTTGTAATGAGCCTGCCGTGATT



In [8]:
#!/usr/bin/env python

# Example 3.8
# DNA_site_conservation.py
# A program that tests whether the nucleotides at a particular site are conserved

# DNA fragments
DNA1 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA2 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA3 = 'TGTTGTAATGAGCCTGCCGTGATT'
PATTERN = ''
# Test whether the nucleotides at a particular site are conserved

if DNA1[0] != DNA2[0] :
    PATTERN = PATTERN + '*'
elif DNA1[0] != DNA3[0] :
    PATTERN = PATTERN + '*'
elif DNA2[0] != DNA3[0] :
    PATTERN = PATTERN + '*'
else :
    PATTERN = PATTERN + '.'

if DNA1[1] != DNA2[1] :
    PATTERN = PATTERN + '*'
elif DNA1[1] != DNA3[1] :
    PATTERN = PATTERN + '*'
elif DNA2[1] != DNA3[1] :
    PATTERN = PATTERN + '*'
else :
    PATTERN = PATTERN + '.'

print ('%s\n%s\n%s\n%s\n' % (DNA1, DNA2, DNA3, PATTERN))

AGTTGTAATGAGGCTGCCGTGATA
AGTTGTAATGAGGCTGCCGTGATA
TGTTGTAATGAGCCTGCCGTGATT
*.



Here is another way of accomplishing the same task:

In [9]:
# Example 3.9
# DNA_site_conservation.py
# A program that tests whether the nucleotides at a particular site are conserved

# DNA fragments
DNA1 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA2 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA3 = 'TGTTGTAATGAGCCTGCCGTGATT'
PATTERN = ''
# Test whether the nucleotides at a particular site are conserved

if DNA1[0] != DNA2[0] or DNA1[0] != DNA3[0] :
    PATTERN = PATTERN + '*'
else :
    PATTERN = PATTERN + '.'
    
if DNA1[1] != DNA2[1] or DNA1[1] != DNA3[1] :
    PATTERN = PATTERN + '*'
else :
    PATTERN = PATTERN + '.'
    
# This is getting redundant.  Let's learn loops to finish this

print ('%s\n%s\n%s\n%s\n' % (DNA1, DNA2, DNA3, PATTERN))

AGTTGTAATGAGGCTGCCGTGATA
AGTTGTAATGAGGCTGCCGTGATA
TGTTGTAATGAGCCTGCCGTGATT
*.
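
A third possibility, sketched below, is to test for conservation directly with a chained comparison, in the same spirit as the chained range test in Example 3.4. This is only one way to write it; it uses the same sequences as above and should give the same pattern.

```python
DNA1 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA2 = 'AGTTGTAATGAGGCTGCCGTGATA'
DNA3 = 'TGTTGTAATGAGCCTGCCGTGATT'
PATTERN = ''

# A site is conserved when all three nucleotides are equal
if DNA1[0] == DNA2[0] == DNA3[0]:
    PATTERN = PATTERN + '.'
else:
    PATTERN = PATTERN + '*'

if DNA1[1] == DNA2[1] == DNA3[1]:
    PATTERN = PATTERN + '.'
else:
    PATTERN = PATTERN + '*'

print('%s\n%s\n%s\n%s\n' % (DNA1, DNA2, DNA3, PATTERN))
```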



## Exercises

Use these Drosophila alcohol dehydrogenase sequences https://moodle.umass.edu/mod/resource/view.php?id=1386275 from the McDonald and Kreitman paper for the following exercises.


1. Write a program to determine the total number of A, T, C and G in each sequence.

2. Write a program to determine if any of the six sequences are identical.
 
3. For the 6 sequences, determine whether each site is polymorphic and print the results at the bottom of the alignment (e.g. as in Examples 3.8 and 3.9).

* Next - <a href="http://nbviewer.ipython.org/github/jeffreyblanchard/EvoGenV5/blob/master/EvoGenV5_Lab4.ipynb">Session 4 : Detecting Selection in Strings with Loops</a>
* Previous - <a href="http://nbviewer.ipython.org/github/jeffreyblanchard/EvoGenV5/blob/master/EvoGenV5_Lab2.ipynb">Session 2 : Sequences and Strings</a> 