# Lab 1

## Part 2 - Getting to know python

So far, we have covered the bash basics. This gives you enough information to be able to get around in these notebooks, as well as on the command line (particularly when you complete the homework task). Now, we are going to have a look at python - a different programming language and one that we will be focusing on heavily in the first half of the course.

You can do a lot of things with python, as it is a very powerful language. Python comes pre-built with a standard library: a collection of modules (which we will loosely call libraries), each containing functions that you can use to do your work. If that doesn't mean much to you right now, it will as we move on through the semester.

The first and most important thing you need to know about python is that if you want to use any of these functions (and you do, why would you reinvent the wheel?) you need to **import** them first. In the next code chunk, we are going to import a library called **sys** which, as the name suggests, lets you perform system-specific tasks, such as checking which version of python you are running, or figuring out where your python executable file is. (Finding your current working directory is handled by a different library, **os**.) In fact, locating the python executable is exactly what the code chunk below does.

In [None]:
import sys
sys.executable

What you might have noticed from the above code chunk is that we first imported the sys library using:

```python
import sys
```

We then asked python to tell us where the python executable file is (like an `.exe` file on a Windows system, or a `.app` on a Mac system). We did that with this line:

```python
sys.executable
```

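Note that **sys** is the library and **executable** is a name defined inside that library. Strictly speaking, `executable` is not a function but an attribute (a stored value, in this case the path as a string), which is why there are no parentheses after it. We need to use this way of referring to things inside a library because if we had just used the command: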

```python
executable
```

python wouldn't have known what you meant, and would have stopped with a `NameError`.

So always remember: `library.function`
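
As a small sketch of the `library.function` pattern, the cell below uses two libraries that ship with python: `sys` and `os`. Note that `os` is not used elsewhere in this part of the lab and is brought in here purely for illustration, and the variable names are made up for this example. It shows the difference between an attribute, which you read without parentheses, and a function, which you call with them.

```python
import os
import sys

# Attributes are stored values: no parentheses needed.
python_path = sys.executable   # path to the python executable
python_version = sys.version   # version information, as a string

# Functions do some work when called: parentheses are required.
working_dir = os.getcwd()      # current working directory, like `pwd` in bash

print(python_path)
print(python_version)
print(working_dir)
```

If you forget the library name (for example, typing `getcwd()` on its own), python will not know where to look for it.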

### Custom libraries

Unfortunately, base python doesn't contain a lot of what we need to do our bioinformatics work. Why would it? Python is used the world over for many things, and biology is more of a niche use case. Luckily, the bio community have created a library called **BioPython** that will allow you to access functions that are more applicable to the work that you are doing. And, we have installed it for you!

The next code block loads and tests some functionality provided by the BioPython library. 

In [None]:
from Bio.Seq import Seq
my_seq = Seq("AGTACACTGGT")
complemented_seq = my_seq.complement()
str(complemented_seq)

Let's go through this line by line.

```python
from Bio.Seq import Seq
```

Here, we imported the `Seq` class from the `Bio.Seq` module, which is part of BioPython.

```python
my_seq = Seq("AGTACACTGGT")
```

This will be tricky to wrap your head around if you are brand new to programming. Here, we have declared a variable called `my_seq`. I like to think of variables like buckets - you can call them whatever you want, and you can put whatever you want inside of them. You can also change what's in them.

In this case, we have specified a string of nucleotides, and we have wrapped it inside a call to `Seq`. `Seq` is a class: a definition of a particular kind of object, together with the functions (called methods) that objects of that kind can use. By calling `Seq(...)`, we have taken our string of nucleotides and converted it to an object of type `Seq`. This gives us access to all of the functions that `Seq` objects have available to them, which we can now reach through our variable name, `my_seq`.
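
If the bucket idea still feels abstract, here is a minimal sketch using only plain python (no BioPython). The variable name `my_bucket` is made up for this example. It shows that a variable can hold one kind of thing, and then be given something completely different.

```python
# Put a string of nucleotides into a bucket called my_bucket.
my_bucket = "AGTACACTGGT"
print(type(my_bucket))   # type() tells us what kind of object is in the bucket
print(len(my_bucket))    # len() counts the characters in the string

# Change what is in the bucket: the old contents are replaced.
my_bucket = 42
print(type(my_bucket))
```

The same `type()` check works on `my_seq` if you want to confirm for yourself that it holds a `Seq` object rather than a plain string.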

```python
complemented_seq = my_seq.complement()
```

Now, we move on to doing what we actually wanted to do, and that is to complement our sequence. We could have done this by hand, but imagine you are working with large FASTA files - you definitely want an easier way of doing this! Now that our sequence has been converted to type `Seq`, we can access the inbuilt functions. One of them is called `complement`. We store our complemented sequence in a variable called `complemented_seq`.

```python
str(complemented_seq)
```

Once we have complemented our sequence, we want to convert it back to a simple string. We can do that using the type casting function `str()`. Because this is the last line of the cell and we have **not** assigned the result to a variable, the notebook simply displays it on the screen.
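
To get a feel for what `complement` is doing for us, here is a rough sketch of complementing a sequence "by hand" in plain python. This is only an illustration, not how BioPython implements it: the names `COMPLEMENT` and `complement_by_hand` are made up for this example, and the sketch assumes the sequence contains only uppercase A, C, G and T.

```python
# Lookup table: each nucleotide maps to its complement.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_by_hand(sequence):
    """Return the complement of an uppercase DNA string (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in sequence)

by_hand = complement_by_hand("AGTACACTGGT")
print(by_hand)  # TCATGTGACCA
```

Any other character (lowercase letters, `N`, and so on) would cause an error in this sketch, which is one of the reasons to prefer a well-tested library.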

### Finding out more information about functions

Remember how we used `man` in bash to find out information about commands? You can do the same thing for python functions. The syntax we use in these notebooks is

```python
?library.function
```

See it in action below. Run the code chunk to see the information for the `Seq.complement` function.

In [None]:
?Seq.complement
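
The `?` shortcut is a notebook feature. As a hedged sketch of how to get similar information in plain python, the cell below uses the built-in `help()` function and the `__doc__` attribute on two built-in functions, `len` and `str.upper`, which were chosen only as examples. The variable name `doc_text` is made up for this example.

```python
# help() prints the documentation for whatever you pass to it.
help(len)

# The same kind of text is stored on the function itself as its docstring.
doc_text = str.upper.__doc__
print(doc_text)
```

You should be able to pass library functions to `help()` in the same way once they have been imported.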

### Python operators

You can do even more in python. Operators are a huge part of programming, and there are lots to choose from. We won't go over all of them now, but this link is a good one to bookmark for later: https://www.w3schools.com/python/python_operators.asp

For now, I want to focus on the **addition** (`+`) operator, as it is a special case in python. 

In [None]:
print(10 + 100)

This is pretty straightforward. I have used the `print()` function just to show you that it exists - in these jupyter notebooks, you often don't need it, because the value of the last line in a cell is displayed automatically. It works much like `echo` in bash.

However, the `+` operator, in python, works on more than just numbers. You can use it to concatenate two strings together.

In [None]:
this_string = "ATT" + "TAG"
print(this_string)
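
Here is a short sketch of why `+` counts as a special case: what it does depends on the type of the things on either side of it. All of the variable names are made up for this example.

```python
number_sum = 10 + 100      # numbers are added
joined = "ATT" + "TAG"     # strings are joined end to end
text_sum = "10" + "100"    # these are strings too, so they are joined

print(number_sum)
print(joined)
print(text_sum)

# Mixing a string and a number is an error: python will not guess
# whether you meant addition or concatenation.
try:
    "ATT" + 3
except TypeError as error:
    print("TypeError:", error)

# Convert explicitly to say which behaviour you want.
labelled = "ATT" + str(3)
print(labelled)
```

Notice that `"10" + "100"` gives `10100`, not `110`, because both values are strings.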

Keep this in mind as we move to part 3.

## Go to Part 3 of the lab

You have now finished part 2 of the lab - getting to know python - and are ready to take on part 3.

## [Click here to go to Lab 1 Part 3](lab01_part3_R.ipynb)