# Computational Skills for Biocuration

## Programming Skills with Python

### Databases and Python

We've learned a lot over the past few days. On days 1 & 2, we introduced some of the fundamental concepts in programming with Python, such as

- variables
- strings
- lists
- dictionaries
- defining our own functions
- conditionals (`if`, `elif`, `else`)

before moving on to look at how we can use Python to download and work with data from online databases/resources.

Yesterday, we covered the fundamental considerations in designing our own local database(s), along with some SQL commands for reading information from and writing it to them.

Today, we're going to wrap things up by __combining Python and SQLite__ to show you how you can programmatically create, extend, edit, search, and extract information from local database files.

We will also try to conclude the training by showing you how to combine the elements of Python programming that we've been discussing into a single, reusable _script_.

#### Introducing the `sqlite3` module

Just like we needed to import `requests` to interact with online databases, and `json` to work with data in JSON format, we need to use the `sqlite3` module to work with the kind of SQLite databases that you were introduced to yesterday.
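
As a minimal sketch of what this looks like in practice, the cell below imports the module, opens a connection, and runs a trivial query. The special name `":memory:"` is an assumption made for this sketch: it creates a temporary database in RAM, so you would replace it with the path to your own database file.

```python
import sqlite3

# Assumption for this sketch: ":memory:" gives us a temporary, empty database
# that lives in RAM. Replace it with the path to your own database file.
connection = sqlite3.connect(":memory:")

# The cursor is the object we use to send SQL commands to the database.
cursor = connection.cursor()

# A trivial query that needs no tables, just to check that the connection works.
cursor.execute("SELECT sqlite_version()")
version = cursor.fetchone()[0]
print("SQLite library version:", version)

connection.close()
```

The version number printed will depend on your Python installation.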

##### Exercise

Given the list of UniProt IDs, `id_list`, and using the `annotation` table as above, write a loop to create a dictionary with these IDs as the keys and their GO terms as the values.

__Reminder:__

- A dictionary is created with `{}` or the `dict()` function
- To add an entry to a dictionary, use `dictname[key] = value`, where `key` and `value` are the variables/values that you want to store as the key and value respectively
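
As a quick illustration of the reminder above, here is a small sketch that fills a dictionary inside a loop. The gene lists and variable names are made up for this example and are unrelated to the exercise data.

```python
# Example data chosen for illustration: gene symbols and their chromosomes.
genes = ["BRCA1", "TP53", "CFTR"]
chromosomes = ["17", "17", "7"]

# Start with an empty dictionary, then add one entry per loop iteration.
gene_to_chromosome = {}
for gene, chromosome in zip(genes, chromosomes):
    gene_to_chromosome[gene] = chromosome

print(gene_to_chromosome)
# → {'BRCA1': '17', 'TP53': '17', 'CFTR': '7'}
```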

```python
id_list = ['Q9NR21', 'Q9H339', 'Q969R5',
           'Q9Y6T7', 'Q9UPI3', 'Q9NZL6',
           'Q96RU2', 'Q9NQA3', 'Q9H9P8',
           'Q9NY57', 'Q96HL8', 'Q9HCJ6',
           'Q9Y6U3', 'Q9UJS0', 'Q9NX62'
]
```

#### Packaging into a function

##### Key points

- `sqlite3` provides us with functionality for working with database files
- use `sqlite3.connect()` to connect with a database
- create a _cursor_ object with `connection.cursor()` and use it to execute SQL commands from within Python
- make sure to _commit_ your changes to save them to the database
- use `cursor.fetchone()` and `cursor.fetchall()` to collect the results of a selection from a table
- __always__ close the database connection when you're done
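
The key points above can be sketched as one short workflow. Everything specific here is an assumption made for illustration: the in-memory database, the `demo` table, and its `sample_id` and `label` columns are invented and are not the tables used elsewhere in this course.

```python
import sqlite3

# Assumption: an in-memory database and a made-up table, for illustration only.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# Create a table and insert a few rows, using ? placeholders for the values.
cursor.execute("CREATE TABLE demo (sample_id TEXT, label TEXT)")
rows = [("s1", "control"), ("s2", "treated"), ("s3", "treated")]
cursor.executemany("INSERT INTO demo VALUES (?, ?)", rows)

# Commit, so that the inserted rows are saved to the database.
connection.commit()

# fetchone() returns a single row as a tuple, or None if nothing matched.
cursor.execute("SELECT label FROM demo WHERE sample_id = ?", ("s1",))
first = cursor.fetchone()

# fetchall() returns a list of tuples, one per matching row.
cursor.execute(
    "SELECT sample_id FROM demo WHERE label = ? ORDER BY sample_id",
    ("treated",),
)
treated = cursor.fetchall()

print(first)    # → ('control',)
print(treated)  # → [('s2',), ('s3',)]

# Always close the connection when you're done.
connection.close()
```

Note that each row comes back as a tuple, even when only one column was selected, so you need an index such as `first[0]` to get at the value itself.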


#### Combining remote & local databases (AKA: API SQL LOL WTF)

##### Further reading

- the [official documentation for `sqlite3`](https://docs.python.org/3.7/library/sqlite3.html) is very helpful but also very long!
- I recommend [this post](https://www.pythoncentral.io/introduction-to-sqlite-in-python/) to learn more about using the `sqlite3` module in Python.
- the SQLite tutorial yesterday mentioned _regular expressions_ (regex) as a sophisticated tool for matching patterns in text. There's no need to know more about regex for this course but, if you would like to learn more about how to use this powerful approach, the EMBL Bio-IT Project [has introductory online course material](https://tobyhodges.gitbooks.io/introduction-to-regular-expressions/).