# Importing functions and data

Sometimes you want to perform an action or analysis on your data that many other people have needed to perform before.  In many cases, someone has already written functions that do exactly what you are looking to do.  The great thing about Python is that you can import these functions by importing Python libraries and then using the functions contained in those libraries.

Lots of really smart people have written really cool libraries, and one of the rules of good programming is to avoid reinventing the wheel.

Let's begin by importing a module that is contained in Python's standard library.  The standard library is a collection of useful, curated modules that come standard with most installations of Python.

## Import statements

Import statements are usually found at the beginning of a program, and need to be executed BEFORE any of the module's functions are used in the body of the code.

The module we will attempt to use is the `time` module:

In [2]:
import time

Now that we've imported the `time` module, we can use the functions defined in that module by specifying where each function 'lives'.

For example, the `time` module contains a function called `sleep()` which simply causes the program to pause for a defined number of seconds before continuing down the program.  

Try it out!

In [3]:
# without sleeping

print('begin program.')
x = 'hello world!'
print(x)
print('end program.')

begin program.
hello world!
end program.


In [4]:
# with sleeping

print('begin program.')
x = 'hello world!'
print(x)
time.sleep(5)  # program should wait 5 seconds before continuing
print('end program.')

begin program.
hello world!
end program.


Notice that in order to call the `sleep()` function we had to put the module name in front of it, separated by a '`.`' (`time.sleep`).  Simply calling `sleep()` without telling the computer where to find that function results in an error:


In [5]:
# without the module name in front of sleep()

print('begin program.')
x = 'hello world!'
print(x)
sleep(5)  # NameError: the name 'sleep' by itself has not been defined
print('end program.')

begin program.
hello world!


NameError: name 'sleep' is not defined

The computer is telling us that we have not defined anything using the name 'sleep'!
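
If typing the module name every time feels repetitive, Python also lets you import a single name from a module, or give the module a shorter nickname. The following is a small sketch of both forms; the alias `t` is just an illustrative choice, and the pauses are shortened so the cell runs quickly.

```python
from time import sleep  # import only the sleep function from the time module
import time as t        # or import the whole module under a shorter alias

print('begin program.')
sleep(0.1)      # no 'time.' prefix needed, because the name 'sleep' was imported
t.sleep(0.1)    # the same function, reached through the alias
print('end program.')
```

Both calls reach the same function; the only difference is which name you use to find it.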

## The random module

In [7]:
import random

The random module contains a function called `randint()` that returns a random integer between the two values you provide as input, with both of those values included as possible results.

In [13]:
## Generate a random integer between 0 and 10

x = random.randint(0,10)
print(x)

7


Just for fun, run the above cell a few times, and notice you get different values each time you run it!  Cool, huh?
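
Instead of re-running the cell by hand, you can draw many values at once and look at the range they cover. This is only a quick sketch: the variable name `draws` is made up for the example, and the `random.seed()` call is optional, shown here to illustrate how a run can be made repeatable.

```python
import random

random.seed(42)  # optional: fixing the seed makes the sequence of draws repeatable

# draw 1000 random integers using the same call as above
draws = [random.randint(0, 10) for _ in range(1000)]

# every value falls between the two inputs, and the inputs themselves can appear
print(min(draws), max(draws))
```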

## Import custom functions

Not only can you import functions and modules from the standard library, but you can get modules from **many** other places, including your local folders.

Remember before the break we wrote a function to look for start codons?  We actually saved that function to a file called `dna_analyze.py`.  Python can import functions from a local file like this one using the following syntax:

In [26]:
from dna_analyze import dna_analyze

dna_analyze('AAAAAAAAAAAAAAAAAA')

The length of dna1 is 18 basepairs.
No start codon detected.
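
The import works because a file named `dna_analyze.py` sits in the same folder as the notebook and defines a function that is also named `dna_analyze`: the name after `from` is the file name without `.py`, and the name after `import` is the function. The original file is not shown here, so the sketch below is only a guess at what such a file could contain; the message wording and the returned position are assumptions made for illustration.

```python
# Hypothetical contents of dna_analyze.py (an illustration, not the original file)

def dna_analyze(seq):
    """Print the length of seq and report whether it contains the start codon ATG."""
    print('The length of the sequence is', len(seq), 'basepairs.')
    # str.find() returns the index of the first match, or -1 if there is none
    position = seq.find('ATG')
    if position == -1:
        print('No start codon detected.')
    else:
        print('Start codon detected at position', position)
    return position
```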


## MUTATIONS!!!
One of the ways organisms gain new phenotypes is through mutation.  Mutations are random, and these mutations, if not selected against, can accumulate over generations.

How would you go about modeling something that is random, and causes changes in a DNA sequence...?

Think back to lesson2.0

**Let's write a function to simulate random mutations!!**

In [21]:
%%writefile mutate.py

import random  # mutate() uses random.randint(), so this file needs its own import

## Start with our original gene

dna1 = "TTTATGCCC"

def mutate(seq):
    # make our input sequence a list
    mutated_seq = list(seq)
    # find the length of our input sequence
    seqlen = len(mutated_seq)
    # Choose a random nucleotide to change
    mutation_location = random.randint(0,seqlen - 1)
    ## we need a list of new base options to choose from
    base_options = ['A','T','C','G']
    ## randomly choose one of our base options
    new_base = base_options[random.randint(0,3)]
    ## replace the selected mutation location with the randomly chosen base
    mutated_seq[mutation_location] = new_base
    ## join the list back into a string, and return that string
    return ''.join(mutated_seq)
    
dna2 = mutate(dna1)
print(dna2)

TTTGTGCCC


Great!  Now we have a function that will randomly select a nucleotide within a given sequence, and replace that nucleotide with a randomly selected new one.
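
One thing to notice: because the new base is drawn from all four options, `mutate()` will sometimes pick the base that is already there, and the sequence comes back unchanged. If you want every call to produce a real change, one possible variant is sketched below. The name `mutate_guaranteed` is made up for this example, and it swaps in `random.choice()` to pick from a list instead of indexing with `randint()`.

```python
import random

def mutate_guaranteed(seq):
    # make our input sequence a list so we can change one position
    mutated_seq = list(seq)
    # choose a random position in the sequence
    mutation_location = random.randint(0, len(mutated_seq) - 1)
    # only offer the bases that differ from the one currently at that position
    current_base = mutated_seq[mutation_location]
    base_options = [base for base in 'ATCG' if base != current_base]
    # random.choice() picks one item from a list at random
    mutated_seq[mutation_location] = random.choice(base_options)
    # join the list back into a string, and return that string
    return ''.join(mutated_seq)

print(mutate_guaranteed('TTTATGCCC'))
```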

**Biology alert**: "In general, the mutation rate in unicellular eukaryotes and bacteria is roughly 0.003 mutations per genome per cell generation. This means that a human genome accumulates around 64 new mutations per generation because each full generation involves a number of cell divisions to generate gametes"  -Wikipedia


In [35]:
## use your dna_analyze() function to compare the original dna sequence to the mutated one.



Crazy thing about mutations:  they add up over time and generations.
So now what we need is a way to allow for multiple mutations over a number of generations.


In [43]:
def genetic_drift(seq, generations):
    for generation in range(generations):
        seq = mutate(seq)
        # can you make this function display changes in each generation?
    return seq

In [44]:
genetic_drift('AAAAA',100)

'CTTAG'
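
One possible answer to the question in the comment inside `genetic_drift()` is sketched below. The name `genetic_drift_verbose` is made up for this example, and a compact copy of `mutate()` is repeated (using `random.choice()` for brevity) so that the block runs on its own.

```python
import random

def mutate(seq):
    # same idea as mutate() above: replace one random position with a random base
    mutated_seq = list(seq)
    mutation_location = random.randint(0, len(mutated_seq) - 1)
    mutated_seq[mutation_location] = random.choice(['A', 'T', 'C', 'G'])
    return ''.join(mutated_seq)

def genetic_drift_verbose(seq, generations):
    for generation in range(generations):
        new_seq = mutate(seq)
        if new_seq != seq:
            # report only the generations where the sequence actually changed
            print('generation', generation + 1, ':', seq, '->', new_seq)
        seq = new_seq
    return seq

final_seq = genetic_drift_verbose('AAAAA', 10)
print('final sequence:', final_seq)
```

Generations where the randomly chosen base matches the existing one are skipped in the printout, so you may see fewer than 10 lines.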