# Introduction

The ability to write your own computer code to analyze data is a great skill for any scientist to have! By taking the time to write a few lines of code, you can analyze experimental data, organize your files, automate repetitive number-crunching tasks, and create workflows that will be transparent, reproducible, and easily shared with colleagues and collaborators.

The goal of these workshops is to help you get your coding “feet” wet, by learning some basic Python programming skills. Then, you’ll move on to apply those skills to real image data sets.


##Why Python?

Python is a programming language that is commonly used in the natural sciences to create workflows and analyze data. Although you should use whatever programming language makes sense for your work, Python has several advantages:

1. Python is an open-source language that is freely available
2. Python runs on almost all computing platforms and environments
3. Python has a growing base of scientific users! There are numerous online resources and examples available
4. Python is relatively easy for new users to learn
5. Python has a variety of tools available for image processing and computer vision, including the [scikit-image image processing library](https://scikit-image.org/).

---
> Learn the language that makes the most sense for you!
> 
> Although Python is commonly used in the natural sciences, different communities 
> of scientists may have a different “preferred” programming language. You should 
> learn and use the language that is common in your community, so that it will be 
> easier to share and collaborate on code with your colleagues.
> 
> Fortunately, the basic concepts and ideas you are learning in Python will also 
> be present in other programming languages!
> 
> Once you’ve learned one, picking up another language will be much easier.
---

##The examples

The best way to learn Python is to read, write, and experiment with Python code that solves real problems! To help you do this, we will focus on two image-based examples.

###Example 1: Bacteria growing on a plate

The images below show a number of bacterial colonies growing on plates. Imagine that you have conducted a large experiment, with multiple plates growing under different conditions. You might be interested in counting colonies on each plate, in computing the total percent area covered by the bacterial colonies, or even tracking colony growth over time. All of these tasks could be accomplished by a computer program! And once you have analyzed your images to get colony counts or total area, you could write and use a second program to analyze the data (for example, to compare colony number and total growth at different timepoints or under different growth conditions).

<img src="https://eldoyle.github.io/PythonIntro/fig/00-colonies01.jpg" alt="Bacteria colony 1" style="float: left; margin-right:10px;"/>

<img src="https://eldoyle.github.io/PythonIntro/fig/00-colonies03.jpg" alt="Bacteria colony 2" style="float: left; margin-right:10px;"/>


###Example 2: Analyzing a titration

The video below (click on the image to see the video) shows the progress of a titration reaction. Rather than watch the reaction’s progress yourself, you could write a program to monitor the color of the solution, graph the change in color over time, and even send you an alert notification when the reaction is complete!

<a href="https://www.youtube.com/watch?t=554&v=NLSY5S8CABk&feature=youtu.be"><img src="https://eldoyle.github.io/PythonIntro/fig/00-titration.jpg" alt="Titration video" style="float: left; margin-right: 10px;"></a>

##Image processing and data analysis in a single workflow

As you work through the Python Intro tutorials, you will encounter these examples at different points. At each point, you will be asked to apply the tools you have just learned to tackle analysis of one of these image-based data sets. For our purposes of learning basic Python skills, we will assume that you have already analyzed the images: that is, you have already counted colonies on a plate or measured solution color at different timepoints. This set of tutorials will help you build the skills to analyze the data collected from the images.

Later in the Image Processing tutorials, you will learn how to write code to do image processing: for example, to write code that will “look” at an image and count colonies, or write the code that will measure solution color every few seconds.

When put together, these two sets of skills will allow you to construct complete workflows that allow you to collect data and measurements directly from images and then output and analyze the data to answer experimental questions.

---
> How to use these lessons
>
> It would be impossible for us to cover every single programming or Python 
> concept that you might need to know. You also shouldn’t expect yourself to 
> remember every single thing that was covered. The goal of these lessons is to 
> expose you to some fundamental concepts of programming, using Python. Later 
> when you need to accomplish a task you can remember that you probably learned 
> something about it, and then you can go look up the details in your notes or 
> somewhere else on the Internet.
> 
> Just because we show you one way to do something doesn’t mean that’s the only 
> way, or even the best way, to do it! Good programmers are constantly learning 
> new ways to do things.
> 
> Don’t feel like you have to reinvent the wheel every time and write all of 
> your code from scratch. Someone has probably written code that is similar to 
> what you want to do, and it’s probably on the Internet. Just make sure you 
> understand what the code is doing before you copy it into your program!
> 
> Plate images are from the paper “High-throughput imaging of bacterial 
> colonies grown on filter plates with application to serum bactericidal assays.
> ” Liu X, Wang S, Sendi L, Caulfield MJ. J Immunological Methods, 2004.
---

#Variables

##Creating variables

If we want to, we can essentially use Python as a fancy calculator. Type the mathematical expression shown here into the *code cell* below, and press Shift+Enter to evaluate the expression:

```
5 + 7
```

In [None]:
# Enter the expression here, then press Shift+Enter


Hopefully the output of the expression you just typed was 

```
12
```

Here is another example: evaluate the expression

```
636 / 8
```

In [None]:
# Enter the expression here, then execute it


We should see the result of the division:

```
79.5
```

However, this isn’t very efficient. The real value of using Python for something like this is that we can store that value in a *variable* and use it again later.

Imagine that we have computed colony area in pixels from one of our plate images and we want to convert it to actual area in square mm. Based on the size properties of our image or some kind of scale marker, we have figured out the number of pixels that corresponds to a square mm of area. We can create variables for each of these values and use them to do calculations.

Type the commands below in the following code cell, and then execute them via Shift+Enter.

```
areaInPixels = 6929
mm2PerPixel = 0.0277
areaInPixels*mm2PerPixel
```

In [None]:
# Enter the three lines here, then execute


This will return the value of the last calculation:

```
191.9333
```

If we think that we will need to use this value later, we can also store it in a variable. Type this command into the following code cell, and execute it:

```
areaInMm = areaInPixels * mm2PerPixel
```

In [None]:
# Enter your code here, then execute


Notice that this doesn’t give us any output! If we want to print the value of `areaInMm` to the console, we can get it by simply executing a line with only the variable name:

```
areaInMm
```

Try that out in the cell below.


In [None]:
# Enter your code here, then execute


Now, you should see the value of the area in square mm, `191.9333`. 

##The assignment operator, `=`

A variable is just a name for a value, like `x` or `areaInMm`. We assign a variable name to a value by using the assignment operator, `=`. Unlike our understanding of this operator in mathematics, in Python `=` is not comparing the variable's value to some other value. Rather, `=` assigns a value to a variable. Later, when we test for equality, we will use a different operator (`==`).

The variable name is just a convenient handle that we assign to, and can use to refer to, a value. You can change the value assigned to a variable at any time, just by reassigning it.
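
As a small illustration of the difference between `=` and `==`, the sketch below assigns a value, reassigns it, and then tests it for equality. The names `colonyCount` and `isFiftySeven` and their values are made up for this example.

```python
# Assign the value 42 to the name colonyCount (hypothetical example value)
colonyCount = 42

# Reassign: colonyCount now refers to a new value, and the old one is forgotten
colonyCount = 57

# == asks a question ("are these two values equal?") and gives back True or False
isFiftySeven = (colonyCount == 57)
isFiftySeven
```

Here `isFiftySeven` ends up as `True`, because the most recent assignment is the one that counts.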

For example, let us change the number of square mm per pixel by entering and executing this code:

```
mm2PerPixel = 0.03
mm2PerPixel
```

In [None]:
# Enter and execute the code here


This should return the output `0.03`, the new value of `mm2PerPixel`. 

What about the value of `areaInMm`? Enter just this variable name in the following cell and execute the cell to see its value.

In [None]:
# Enter and execute the code here


It should still have its original value, `191.9333`! 

The value for `mm2PerPixel` changed, but the value for `areaInMm` did not! If we imagine the variable as a sticky note with a name written on it, assignment is like putting the sticky note on a particular value.

This means that **assigning a value to one variable does not change the values of other variables**. Since `areaInMm` doesn’t remember where it came from, it isn’t automatically updated when `mm2PerPixel` changes. This is different from the way spreadsheets work.
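
The sketch below gathers the steps above in one place and shows that the calculation has to be repeated if `areaInMm` should reflect the new value of `mm2PerPixel`. The names `oldArea` and `unchanged` are introduced only for this illustration.

```python
areaInPixels = 6929
mm2PerPixel = 0.0277
areaInMm = areaInPixels * mm2PerPixel   # computed with the original scale factor
oldArea = areaInMm                      # hypothetical name, kept for comparison

mm2PerPixel = 0.03                      # reassigning this does not touch areaInMm
unchanged = (areaInMm == oldArea)       # True: areaInMm still has its old value

areaInMm = areaInPixels * mm2PerPixel   # redo the calculation to use the new value
areaInMm
```

Only the last assignment changes `areaInMm`; until that line runs, it keeps the value it was given originally.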

##Mathematical operations

Python uses all of the usual mathematical operators, and a couple of unusual ones, to perform operations on numbers and variables. These are:

* `+` (addition)
* `-` (subtraction)
* `*` (multiplication)
* `/` (division)
* `**` (exponentiation)
* `%` (modulus, pronounced “mod”, which returns the remainder from division)

Execute the following pre-filled cells to see a few of these in action.

In [None]:
# Execute this cell
areaSquared = areaInMm ** 2
areaSquared

In [None]:
# Execute this cell
remainder = 11 % 2
remainder

The order of operation for mathematical operators is the same as you learned in mathematics. Also, as in our mathematics training, we can override the order of operation via parentheses: `( )`.
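
A brief sketch of how the order of operations and parentheses interact is shown here. The variable names are made up for illustration, and the `print()` function used on the last line is covered later in this lesson.

```python
# Multiplication happens before addition, so this is 2 + (3 * 4)
withoutParens = 2 + 3 * 4      # 14

# Parentheses force the addition to happen first
withParens = (2 + 3) * 4       # 20

# Exponentiation happens before multiplication, so this is 2 * (3 ** 2)
powerFirst = 2 * 3 ** 2        # 18

print(withoutParens, withParens, powerFirst)
```

When in doubt, adding parentheses costs nothing and makes the intended order obvious to a human reader.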

##Different types of variables store different types of data

Python lets us make different “types” of variables. So far you have seen ints (short for integers, numbers without decimal points) and floats (floating point numbers, numbers with decimal points). Python also allows you to make variables that are strings (strings of text, very useful in Biology when dealing with sequence data) and other types of variables that we will introduce you to later.
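
One way to check which type of value a variable refers to is Python's built-in `type()` function, which is not otherwise covered in this lesson. The variable names and values below are made up for illustration, and `print()` is covered later in this lesson.

```python
colonyCount = 42          # an int: a number without a decimal point
plateArea = 6361.7        # a float: a number with a decimal point
species = 'E. coli'       # a string: text inside quotes

print(type(colonyCount))  # <class 'int'>
print(type(plateArea))    # <class 'float'>
print(type(species))      # <class 'str'>
```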

Execute the following cell to create two string variables.

In [None]:
# Execute this cell
name = 'Erin'
word = 'image'

Some operators that we've seen for numbers also work with strings. For example, for strings, `+` means concatenation, joining two strings together. And, for strings, `*` means repetition.

In the following cell, try this expression to test out concatenation.

```
name + word
```

In [None]:
# Enter your code here and execute it


Now, in the following cell, try this expression to test out repetition.

```
name * 2
```

In [None]:
# Enter your code here and execute it


However, not all operations are supported for all combinations of variables. For instance, "adding" a string and a number doesn't make sense. Execute the following cell to see what happens when we tell Python to execute an expression that doesn't make sense in this way.

In [None]:
# Execute this cell
name + remainder

Since Python doesn’t know exactly how to add a word to a number, it returns an error message. Error messages are usually very descriptive (albeit cryptic to beginners), and can tell you a lot about problems in your code!
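
If you do want to combine text and a number, one common approach is to convert the number to a string first with the built-in `str()` function. The sketch below is only an illustration: the `try`/`except` statement it uses to catch the error is not covered in this lesson, and the name `label` is made up. The values mirror the ones used above.

```python
name = 'Erin'
remainder = 11 % 2            # the int 1

try:
    name + remainder          # adding a string and an int is not defined
except TypeError as error:
    # Python reports this kind of problem as a TypeError
    print('TypeError:', error)

# Converting the number to a string first makes + mean concatenation
label = name + str(remainder)
label
```

After this runs, `label` holds the string `'Erin1'`.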

##Good practices - variable names

Variables should be given descriptive names. Python requires that variable names begin with a letter or an underscore and contain only letters, digits, and underscores; they also cannot be one of Python's reserved keywords (such as `for` or `if`). Names are case sensitive (`Text`, `text`, and `tEXT` are all different variables). Beyond this, Python doesn’t care what you call your variables. You could name them all after different kinds of fruit if you want! However, this would be very confusing when you went back later to look at your code.

You should do your best to give your variables descriptive, useful names that describe the data they hold. If you cannot come up with an appropriate variable name, it may mean that you don't yet understand how to solve the problem at hand!
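
To make the point concrete, the sketch below performs the same calculation twice: once with fruit names and once with descriptive names. The fruit names are, of course, invented for this illustration.

```python
# Hard to read: the names say nothing about the data they hold
apple = 6929
banana = 0.0277
cherry = apple * banana

# Easier to read: the names describe what each value means
areaInPixels = 6929
mm2PerPixel = 0.0277
areaInMm = areaInPixels * mm2PerPixel

# Names are case sensitive: these are two different variables
text = 'lowercase'
Text = 'capitalized'
```

Python treats both versions of the calculation identically; only the human reader can tell that `cherry` is an area in square mm.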

---
> DIVAS variable name conventions
>
> In the DIVAS project, variables should start with a lowercase letter, and 
> capitalize the first letter of each successive word. This is known as “camel case”.
>
> For example, `plateArea`, or `colonyCount`.
---

#The `print()` function

So far we have seen that if we have a single expression or variable in a code cell, and then execute the cell, the value of that expression or variable is displayed as a result. Often, however, we would like to see more than a single value, and / or show some descriptive text along with the value to make it easier for humans to read the output.

That is where the Python `print()` function can come in handy. This function allows one or more values to be displayed when the function is executed. For example, `print(areaInMm)` displays the value of the variable `areaInMm`. 

When we wish to print more than one thing on the same line, we can separate the values we want to print by commas, like this: `print('Area in mm^2:', areaInMm)`. This would be good practice, as it labels the output so that a human reader can correctly interpret the output. 
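
A minimal sketch of both uses of `print()` follows, reusing the values from earlier in this lesson. The label text is just one possible wording.

```python
areaInPixels = 6929
mm2PerPixel = 0.0277
areaInMm = areaInPixels * mm2PerPixel

# One value per call: each print() produces its own line of output
print(areaInPixels)

# Several values separated by commas are printed on one line,
# with a single space between them
print('Area in pixels:', areaInPixels, 'Scale (mm^2 per pixel):', mm2PerPixel)
print('Area in mm^2:', areaInMm)
```

Unlike a bare variable name at the end of a cell, `print()` can appear anywhere in a cell and as many times as you like.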

---
> Good practice: Write your code for a human, not a computer!
>
> Python doesn’t care about picking good variable names, whether or not you 
> break up chunks of code with whitespace, or whether or not you include 
> helpful notes to yourself or other users. However, all of these things will 
> make a difference to a human reader. You should always write your code for a 
> human audience: specifically yourself or someone else who might pick up your 
> project months or years from now and need to figure out what your code 
> actually does.
> 
> To improve the readability and usefulness of your code, you should
> 
> 1. Use good variable names (see the previous lesson for examples).
> 2. Break up long pieces of code logically, using empty lines (whitespace). 
> Try to break up your code into chunks based on function.
> 3. Include comments. The `#` symbol indicates that everything that comes 
> after it on a line of code is a comment. The comment will be ignored by 
> Python, but it can provide useful information to future users of your code 
> (including you!). At a minimum, you should include
> 
>   a. A comment line at the beginning of each section of code explaining what 
> the section is doing
>
>   b. Comments at the end of lines if the line of code does something unusual or interesting
> 
>   c. A comment on the line above the code, rather than at the end of the line, if the comment is long
> 
> Anything that will be important to remember later should be commented! You 
> will not remember why you made that choice three days, three months, or three 
> years from now!
> 
> Python also allows you to include longer comments as block comments (such as 
> the block comment automatically generated by Spyder at the top of a script). 
> Block comments are denoted by three double quotes at the beginning of the
> block comment and three double quotes at the end.
> 
> ```
> """
> This is an example of a block comment.
> This is the second line of the comment.
> """
> ```
> 
> At minimum, include a block comment at the beginning of your Jupyter notebook
> explaining the name, usage, and the purpose/goal of the script.
---

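Putting these suggestions together, a short, commented piece of code might look like the sketch below. The values are the ones used earlier in this lesson; the layout is just one reasonable way to follow the guidelines, not the only one.

```python
"""
Convert a colony area measured in pixels to square mm.
Usage: execute the cell; the result is printed at the end.
"""

# Measurements taken from the plate image
areaInPixels = 6929
mm2PerPixel = 0.0277    # scale factor worked out from the image's scale marker

# Convert the area from pixels to square mm
areaInMm = areaInPixels * mm2PerPixel

# Report the result with a descriptive label
print('Area in mm^2:', areaInMm)
```
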
#Exercises

---
> Exercise 1
> 
> Suppose that a typical Petri dish has a diameter of 90mm. Write Python code
> in the cell below to calculate the surface area of the bottom of the Petri 
> dish, and save the value in a variable called `surfaceArea`.
--- 

In [None]:
# Write code to calculate the surface area of a 90mm Petri dish


---
> Exercise 2
>
> Suppose Petri dish 1 has bacterial colonies covering 1416.85 mm<sup>2</sup>
> of the dish, and that dish 2 has colonies covering 829.58 mm<sup>2</sup>.
> Write Python code to compute the percentage of the surface area covered by 
> colonies in each dish, saving the results in variables named `dish1Percent`
> and `dish2Percent`, respectively. Then, print the results, one per line, with 
> descriptive text.
---

In [None]:
# Write code to calculate the percentage covered for each dish


---
> Exercise 3
> 
> Given the two variables created in the cell below, write code to "switch" 
> their values, so that `density1` refers to 3.28 and `density2` refers to 4.59.
> Verify the results by printing the values of both variables, along with 
> descriptive text.
---

In [None]:
# Execute (but do not modify) this cell
density1 = 4.59
density2 = 3.28

In [None]:
# Write code to switch the values of the variables
