### Functions

Functions are reusable blocks of code that you reference by name. You pass information into a function to customize its execution, and you receive a response representing the outcome of the code defined in the function. *When would you want to define a function?* Consider defining one when you find yourself entering very similar code to execute variations of the same process.

The dataset used for the following example is part of the supplementary materials ([Data S1 - Egg Shape by Species](https://science.sciencemag.org/highwire/filestream/695981/field_highwire_adjunct_files/2/aaj1945_DataS1_Egg_shape_by_species_v2.xlsx)) for Stoddard et al. (2017).

Mary Caswell Stoddard, Ee Hou Yong, Derya Akkaynak, Catherine Sheard, Joseph A. Tobias, L. Mahadevan. 2017. "Avian egg shape: Form, function, and evolution". Science. 23 June 2017. Vol. 356, Issue 6344. pp. 1249-1254. DOI: 10.1126/science.aaj1945. [https://science.sciencemag.org/content/356/6344/1249](https://science.sciencemag.org/content/356/6344/1249)
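
Before turning to the egg dataset, a minimal sketch may help make the parts of a function concrete: values go in through the parameters, and a result comes back through `return`. The function name `rectangle_area` and its parameters are made up purely for illustration and are not part of the example that follows.

```python
# a hypothetical function used only to illustrate the parts of a definition
def rectangle_area(width, height):
    """Return the area of a rectangle with the given width and height."""
    area = width * height
    return area

# the same code runs with different inputs each time the function is called
small = rectangle_area(2, 3)
large = rectangle_area(10, 4.5)

print(small)
print(large)
```

Each call reuses the same body with different arguments, which is the property the rest of this section relies on.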

A sample workflow without functions:

In [67]:
# read data into a list of dictionaries
import csv

# create an empty list that will be filled with the rows of data from the CSV as dictionaries
csv_content = []

# open and loop through each line of the csv file to populate our data file
with open('aaj1945_DataS1_Egg_shape_by_species_v2.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:             # process each row of the csv file
        csv_content.append(row)

print(csv_content[0].keys())
#print()
#print(csv_content[0])

odict_keys(['Order', 'Family', 'MVZDatabase', 'Species', 'Asymmetry', 'Ellipticity', 'AvgLength (cm)', 'Number of images', 'Number of eggs'])


In [68]:
# extract content of each "column" individually

order = []
for item in csv_content:
    try:
        order.append(item['Order'])
    except:
        order.append(None)

family = []
for item in csv_content:
    try:
        family.append(item['Family'])
    except:
        family.append(None)

species = []
for item in csv_content:
    try:
        species.append(item['Species'])
    except:
        species.append(None)

asymmetry = []
for item in csv_content:
    try:
        asymmetry.append(item['Asymmetry'])
    except:
        asymmetry.append(None)

ellipticity = []
for item in csv_content:
    try:
        ellipticity.append(item['Ellipticity'])
    except:
        ellipticity.append(None)

avgLength = []
for item in csv_content:
    try:
        avgLength.append(item['AvgLength (cm)'])
    except:
        avgLength.append(None)

noImages = []
for item in csv_content:
    try:
        noImages.append(item['Number of images'])
    except:
        noImages.append(None)

noEggs = []
for item in csv_content:
    try:
        noEggs.append(item['Number of eggs'])
    except:
        noEggs.append(None)

print(order[0:3])
print(family[0:3])
print(species[0:3])
print(asymmetry[0:3])
print(ellipticity[0:3])
print(avgLength[0:3])
print(noImages[0:3])
print(noEggs[0:3])   

['ACCIPITRIFORMES', 'ACCIPITRIFORMES', 'ACCIPITRIFORMES']
['Accipitridae', 'Accipitridae', 'Accipitridae']
['Accipiter badius', 'Accipiter cooperii', 'Accipiter gentilis']
['0.1378', '0.0937', '0.1114']
['0.3435', '0.2715', '0.3186']
['3.8642', '4.9008', '5.9863']
['1', '27', '7']
['2', '103', '18']

The same workflow using a function:


In [69]:
# define a function that can extract a named column from a named list of dictionaries
def extract_column(source_list, source_column):
    new_list = []
    for item in source_list:
        try:
            new_list.append(item[source_column])
        except KeyError:               # the row has no value under this column name
            new_list.append(None)
    # preview the first three values; str() keeps join() from failing on None
    print(source_column + ": " + ", ".join(str(value) for value in new_list[0:3]))
    return new_list

order = extract_column(csv_content, 'Order')
family = extract_column(csv_content, 'Family')
species = extract_column(csv_content, 'Species')
asymmetry = extract_column(csv_content, 'Asymmetry')
ellipticity = extract_column(csv_content, 'Ellipticity')
avgLength = extract_column(csv_content, 'AvgLength (cm)')
noImages = extract_column(csv_content, 'Number of images')
noEggs = extract_column(csv_content, 'Number of eggs')

print()
print(order[0:3])
print(family[0:3])
print(species[0:3])
print(asymmetry[0:3])
print(ellipticity[0:3])
print(avgLength[0:3])
print(noImages[0:3])
print(noEggs[0:3])
    

Order: ACCIPITRIFORMES, ACCIPITRIFORMES, ACCIPITRIFORMES
Family: Accipitridae, Accipitridae, Accipitridae
Species: Accipiter badius, Accipiter cooperii, Accipiter gentilis
Asymmetry: 0.1378, 0.0937, 0.1114
Ellipticity: 0.3435, 0.2715, 0.3186
AvgLength (cm): 3.8642, 4.9008, 5.9863
Number of images: 1, 27, 7
Number of eggs: 2, 103, 18

['ACCIPITRIFORMES', 'ACCIPITRIFORMES', 'ACCIPITRIFORMES']
['Accipitridae', 'Accipitridae', 'Accipitridae']
['Accipiter badius', 'Accipiter cooperii', 'Accipiter gentilis']
['0.1378', '0.0937', '0.1114']
['0.3435', '0.2715', '0.3186']
['3.8642', '4.9008', '5.9863']
['1', '27', '7']
['2', '103', '18']
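
The `try`/`except` inside `extract_column` only matters when a row lacks the requested key. The sketch below shows that case with three hand-typed rows based on the output above, the last of which deliberately omits `'Asymmetry'`. It is a simplified variant rather than the exact function defined earlier: the preview `print` is left out for brevity, and it catches `KeyError` specifically, which is a narrower choice than a bare `except`.

```python
def extract_column(source_list, source_column):
    """Collect one named value from every dictionary in source_list."""
    new_list = []
    for item in source_list:
        try:
            new_list.append(item[source_column])
        except KeyError:
            # the row has no such key, so record a placeholder instead
            new_list.append(None)
    return new_list

# three hand-typed rows based on the output above; the last one
# deliberately omits 'Asymmetry' to trigger the except branch
rows = [
    {'Species': 'Accipiter badius', 'Asymmetry': '0.1378'},
    {'Species': 'Accipiter cooperii', 'Asymmetry': '0.0937'},
    {'Species': 'Accipiter gentilis'},
]

print(extract_column(rows, 'Species'))
print(extract_column(rows, 'Asymmetry'))
```

The second call returns a list of the same length as the input, with `None` standing in for the missing value, so the extracted columns stay aligned row by row.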


In [70]:
# use the extract_column function in a loop to automatically extract all of the columns from the list
# of dictionaries to create a dictionary representing each column of values

columns = {}
for column in csv_content[0].keys():
    columns[column] = extract_column(csv_content, column)
columns

Order: ACCIPITRIFORMES, ACCIPITRIFORMES, ACCIPITRIFORMES
Family: Accipitridae, Accipitridae, Accipitridae
MVZDatabase: Accipiter badius, Accipiter cooperii, Accipiter gentilis
Species: Accipiter badius, Accipiter cooperii, Accipiter gentilis
Asymmetry: 0.1378, 0.0937, 0.1114
Ellipticity: 0.3435, 0.2715, 0.3186
AvgLength (cm): 3.8642, 4.9008, 5.9863
Number of images: 1, 27, 7
Number of eggs: 2, 103, 18


{'Order': ['ACCIPITRIFORMES',
  'ACCIPITRIFORMES',
  'ACCIPITRIFORMES',
  ...
  'ANSERIFORMES',
  'ANSERIFORMES',
  ... (remaining output truncated)
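
With every column stored under its header name, a single column can be looked up directly and all columns can be processed in one loop. The sketch below is an illustration under stated assumptions: `rows` is a small hand-typed stand-in for `csv_content` (three rows copied from the output shown earlier), and a dictionary comprehension with `dict.get` replaces the explicit loop and the `extract_column` call.

```python
# stand-in for csv_content: three rows copied from the output shown earlier
rows = [
    {'Species': 'Accipiter badius', 'Asymmetry': '0.1378', 'Ellipticity': '0.3435'},
    {'Species': 'Accipiter cooperii', 'Asymmetry': '0.0937', 'Ellipticity': '0.2715'},
    {'Species': 'Accipiter gentilis', 'Asymmetry': '0.1114', 'Ellipticity': '0.3186'},
]

# build {column name: list of values}; row.get() returns None for a missing key
columns = {name: [row.get(name) for row in rows] for name in rows[0].keys()}

# look up one column by its header name
print(columns['Species'])

# report how many values each column holds
for name, values in columns.items():
    print(name + ": " + str(len(values)) + " values")
```

Printing a summary per column, rather than the whole dictionary, keeps the output short even when the file has many rows.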

### Putting it all together

An example of reading a data file and doing basic work with it illustrates all of these concepts. It also illustrates writing a script: a file that combines all of your commands so that they can be run together, [eggs.py](eggs.py) in this case.

    #!/usr/bin/env python
    
    import csv
    
    # create an empty list that will be filled with the rows of data from the CSV as dictionaries
    csv_content = []
    
    # open and loop through each line of the csv file to populate our data file
    with open('aaj1945_DataS1_Egg_shape_by_species_v2.csv') as csv_file:
        csv_reader = csv.DictReader(csv_file)
        for row in csv_reader:             # process each row of the csv file
            csv_content.append(row)
    
    print("keys: " + ", ".join(csv_content[0].keys()))
    
    print()
    print()
    
    # define a function that can extract a named column from a named list of dictionaries
    def extract_column(source_list, source_column):
        new_list = []
        for item in source_list:
            try:
                new_list.append(item[source_column])
            except KeyError:               # the row has no value under this column name
                new_list.append(None)
        # preview the first three values; str() keeps join() from failing on None
        print(source_column + ": " + ", ".join(str(value) for value in new_list[0:3]))
        return new_list
    
    order = extract_column(csv_content, 'Order')
    family = extract_column(csv_content, 'Family')
    species = extract_column(csv_content, 'Species')
    asymmetry = extract_column(csv_content, 'Asymmetry')
    ellipticity = extract_column(csv_content, 'Ellipticity')
    avgLength = extract_column(csv_content, 'AvgLength (cm)')
    noImages = extract_column(csv_content, 'Number of images')
    noEggs = extract_column(csv_content, 'Number of eggs')
    
    print()
    print(order[0:3])
    print(family[0:3])
    print(species[0:3])
    print(asymmetry[0:3])
    print(ellipticity[0:3])
    print(avgLength[0:3])
    print(noImages[0:3])
    print(noEggs[0:3])
    
    # Calculate and print some statistics
    print()
    mean_asymmetry = sum(map(float, asymmetry))/len(asymmetry)
    print("Mean Asymmetry: ", str(mean_asymmetry))
    mean_ellipticity = sum(map(float, ellipticity))/len(ellipticity)
    print("Mean Ellipticity: ", str(mean_ellipticity))
    mean_avglength = sum(map(float, avgLength))/len(avgLength)
    print("Mean Average Length: ", str(mean_avglength))

To execute this script you can use a couple of strategies:

1. Run it with the Python interpreter of your choice by entering `python eggs.py` at the command line.
2. Run it with the Python interpreter referenced in the `#!` line at the beginning of the script. First make sure that the script is executable (`ls -l` shows whether a file is executable, and `chmod u+x eggs.py` makes the script executable for the user that owns the file), then enter the name of the script on the command line: `./eggs.py` if the script is in the current directory.
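
The three mean calculations near the end of `eggs.py` repeat the same pattern, so they are themselves a candidate for a function. The sketch below assumes a hypothetical helper named `column_mean`, which is not part of the original script. It also skips `None` and empty strings, an added assumption about how missing values should be treated; the original calculation would raise an error on such values instead.

```python
def column_mean(values):
    """Return the mean of a list of numeric strings, skipping missing values."""
    numbers = []
    for value in values:
        if value is None or value == '':
            continue                     # skip missing entries
        numbers.append(float(value))
    if not numbers:
        return None                      # nothing to average
    return sum(numbers) / len(numbers)

# first three asymmetry values from the output shown earlier
asymmetry_sample = ['0.1378', '0.0937', '0.1114']
print("Mean Asymmetry:", column_mean(asymmetry_sample))
```

In the script, the same helper could then be called once per column, for example `column_mean(asymmetry)`, `column_mean(ellipticity)`, and `column_mean(avgLength)`.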

    