# Introductory python: ad-hoc usage

This is an introductory workshop for python programming. The aim is to familiarize users with
the basics of python for general bioinformatics usage (format wrangling). The workshop assumes users already know basic
programming concepts such as loops, conditionals and functions. Be aware that this is an ad-hoc workshop in the sense
that it doesn’t go into the details of python as an object-oriented programming language. There are plenty
of resources available online for in-depth learning of classes, methods and other related topics, and I will
not attempt to cover them here.

As an extra, there will be an introduction to
**conda** and **Jupyter**. Both tools are widely used for
reproducible and manageable bioinformatics projects.

## 0 General syntax, types and methods

In [1]:
message = "Hello world!"

In [2]:
print( message )

Hello world!


### 0.0 Objects


Let's begin with the most common types of objects

In [6]:
myInteger = 1
myString = "bananaz"
myFloat = 2.71828
myList = [1,2,"cat",4,5,6,7,8]
myTuple = (1, "pineapple")
myDictionary = { "apple" : 2,
                 "pear" : 3,
                 "grapes" : 1}

Let's see the content of the variables.

In [4]:
myInteger

1

In [5]:
myString

'bananaz'

In [7]:
myFloat

2.71828

In [8]:
myList

[1, 2, 'cat', 4, 5, 6, 7, 8]

In [9]:
myTuple

(1, 'pineapple')

In [23]:
myDictionary



{'apple': 2, 'pear': 3, 'grapes': 1}

* Try to define your own set of similar objects (follow the syntax!!)


* Write, for example, `type(myInteger)`. Do it for the other objects.

The objects' names are pretty self explanatory, but let's review them:


* **myInteger** is an integer object.
* **myString** is a string object.
* **myFloat** is a double-precision floating point number (best approximation for a non integer real).
* **myList** is a list.  Think of it as a vector.  It can contain any kind of object inside it (even other lists).
* **myTuple** is a tuple.  These are similar to lists, the key point is that tuples cannot be modified.
* **myDictionary** is a dictionary. Dictionaries are the python implementation of a hash table. We’ll go into them soon.


Integers and floats behave as you would expect them to. We will go into detail on how the other types work.
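
As a small sketch (the variable names here are illustrative and not part of the workshop objects), `type()` reports the type of a value, and the built-in `int()`, `float()` and `str()` functions convert between types:

```python
# Illustrative names only; they are not used elsewhere in the workshop.
number_type = type(1)          # the int type
text_type = type("bananaz")    # the str type

as_text = str(42)              # integer -> string
as_integer = int("42")         # string -> integer
as_float = float("2.5")        # string -> float

print(number_type, text_type)
print(as_text, as_integer, as_float)
```

Conversions like these become useful when reading files, because everything read from a text file arrives as a string.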

#### 0.0.0 Strings


Strings are sequences of characters. They are defined by enclosing a sequence of characters in
quotes.

Strings can be sliced and indexed to return substrings.  Keep in mind that python
uses 0-based indexing, which means that the first item of a string will be the zeroth item.  Let’s see how
this works, run:


In [11]:
myString

'bananaz'

In [14]:
myString[0] # get the first letter

'b'

In [15]:
myString[3] # get the fourth letter

'a'

In [18]:
# A useful trick to access the last letter
myString[-1]

'z'

In [19]:
# What about slicing a string?
myString

'bananaz'

In [21]:
myString[0:3]

'ban'

Strings are **immutable**: they cannot be modified. See what happens if we try to assign a value to one of its indices:

In [22]:
myString[-1] = "s"

TypeError: 'str' object does not support item assignment

* Try to reason how python handles the slicing with respect to the coordinates that you give to the indexing operator.
* Slice your string into different substrings. Did you get the output that you expected?
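
To check your reasoning, here is a minimal sketch of the slicing rule. It uses a hypothetical variable `word` holding the same text as `myString`. The key point is that a slice `s[start:stop]` includes the item at `start` and excludes the item at `stop`:

```python
# `word` is an illustrative name; it holds the same text as myString.
word = "bananaz"

first_three = word[0:3]   # indices 0, 1 and 2 (index 3 is excluded)
same_thing = word[:3]     # an omitted start defaults to 0
tail = word[3:]           # an omitted stop runs to the end of the string
last_two = word[-2:]      # negative indices count from the end

print(first_three, same_thing, tail, last_two)
```

Because the stop index is excluded, `word[:3] + word[3:]` always rebuilds the original string.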

**String operators**


Strings can be concatenated or repeated with the `+` and `*` operators respectively.

In [24]:
s = "A " + "weird " + "monkey " 

In [25]:
s

'A weird monkey '

In [26]:
s*5

'A weird monkey A weird monkey A weird monkey A weird monkey A weird monkey '

#### 0.0.1 Integer and double arithmetic


The implementation of arithmetic operators in python is as follows:

* `+` for addition
* `-` for subtraction
* `*` for multiplication
* `/` for division
* `**` for exponentiation
* `%` for modulo

In [29]:
1 + 1

2

In [30]:
2 - 3

-1

In [31]:
2 * 4

8

In [32]:
1 / 2

0.5

In [None]:
2 ** 10

In [27]:
10 % 7

3

In [28]:
80 % 2

0
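
One detail worth knowing: in Python 3 the `/` operator always returns a float, even when both operands are integers. The short sketch below also shows `//` (floor division) and the built-in `divmod()`, which are not in the list above; the variable names are illustrative:

```python
true_division = 7 / 2                 # 3.5
exact_division = 8 / 2                # 4.0, still a float
floor_division = 7 // 2               # 3, the quotient rounded down
quotient, remainder = divmod(7, 2)    # quotient and remainder in one call

print(true_division, exact_division, floor_division, quotient, remainder)
```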

If we want to perform more complex operations that require functions, we resort to the `math` module.

In [34]:
import math
math.log(1)


0.0

We will look into modules later.
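
In the meantime, here is a brief sketch of a few other `math` functions that tend to come up in practice (the variable names are illustrative):

```python
import math

square_root = math.sqrt(16)           # square root
natural_log = math.log(math.e)        # natural logarithm of e
log_base_10 = math.log10(1000)        # base-10 logarithm
rounded_down = math.floor(2.71828)    # largest integer <= the input

print(square_root, natural_log, log_base_10, rounded_down)
```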

#### 0.0.2 Lists


Lists can contain several types of objects, including lists themselves. They behave similarly to strings with respect to indexing and slicing:

In [35]:
newList = ["a","b", "c", myList , "monkey"]

Let's see its contents.

In [36]:
newList

['a', 'b', 'c', [1, 2, 'cat', 4, 5, 6, 7, 8], 'monkey']

In [37]:
newList[0]

'a'

In [39]:
newList[3:5]

[[1, 2, 'cat', 4, 5, 6, 7, 8], 'monkey']

In this case, `newList` contains a **nested** list in the fourth entry. The syntax to access items in nested lists would be:

`list_var[top_level_index][nested_index]`

* Access the nested values of newList
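
As a hint, the sketch below applies the same syntax to a separate, hypothetical list called `nested`, so you can still try it on `newList` yourself:

```python
# `nested` is an illustrative list, not one of the workshop objects.
nested = ["a", "b", ["x", "y", "z"], "c"]

inner_list = nested[2]         # the whole nested list
second_inner = nested[2][1]    # second item of the nested list
last_inner = nested[2][-1]     # last item of the nested list

print(inner_list, second_inner, last_inner)
```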


Unlike strings or tuples, lists can be modified. One way of doing that is assigning new values to particular entries:

In [41]:
newList[3] = "d"

In [42]:
newList

['a', 'b', 'c', 'd', 'monkey']

#### 0.0.3 Dictionaries

Dictionaries are a useful way to handle data in python. They are the python implementation of a hash table, which maps a **unique** key to data of any kind. We call each such pair of related information a **key-value** pair, where the unique key is associated with any kind of value.

Let's see how it works by examples. One way to initialize python dictionaries is:

In [47]:
myDictionary = { "apple" : 2,
                 "pear" : 3,
                 "grapes" : 1 }

In [48]:
myDictionary

{'apple': 2, 'pear': 3, 'grapes': 1}

* This assigns the integer 2 value to the key "apple", and so on. Notice that the keys are unique. What happens if we initialize the dictionary with non-unique keys?
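
A quick sketch of the answer, using a hypothetical `counts` dictionary: Python does not raise an error, and the last value given for a repeated key silently wins.

```python
# "apple" appears twice on purpose; `counts` is an illustrative name.
counts = {"apple": 2, "pear": 3, "apple": 10}

print(counts)
print(len(counts))
```

This is worth remembering when building dictionaries from files, since duplicated identifiers will overwrite each other without any warning.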

We can easily access each value by providing the corresponding key:

In [50]:
myDictionary["grapes"]

1

In [51]:
myDictionary["pear"]

3

The previous kind of initialization is useful when we have a low number of key-value pairs, but what if we wanted, for example, to map thousands of SNPs to their respective chromosomal positions? Another way to initialize a dictionary and update it would be:

In [52]:
genome_dict = dict() # Create an empty dictionary

In [53]:
genome_dict = {} # Works the same way

In [54]:
genome_dict["BovineHD4100000577"] =  98367573
genome_dict["BovineHD4100000819"] = 144587013

In [55]:
genome_dict

{'BovineHD4100000577': 98367573, 'BovineHD4100000819': 144587013}

* This behaviour works for any kind of initialized dictionary. Add more fruits to myDictionary.

#### 0.0.4 Tuples


Tuples are **immutable** sequences of values. They can be indexed and sliced like strings and lists, but cannot be modified.

In [56]:
t = 1,2,3,4,"five"
t

(1, 2, 3, 4, 'five')

In [57]:
t = (1,2,3,4,"five")
t

(1, 2, 3, 4, 'five')

In [58]:
t[-1]

'five'

In [59]:
t[0]

1

In [60]:
t[1:2]

(2,)

In [61]:
t[0] = "one"

TypeError: 'tuple' object does not support item assignment
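
Tuples are often used to group related values that are later unpacked into separate variables, a pattern that reappears in the example scripts at the end of the workshop. A small sketch, with illustrative names:

```python
# A (SNP id, position) pair; the names are illustrative.
record = ("BovineHD4100000577", 98367573)

snp_id, position = record     # unpack: one variable per tuple item

print(snp_id)
print(position)
```

The number of variables on the left must match the number of items in the tuple, otherwise python raises a `ValueError`.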

#### 0.0.5 Functions

A function is a piece of code written to perform a particular action.

Let's see the basic syntax writing a function that returns the first letter of an input string:

In [62]:
def first( s ):
    
    return s[0]
    

In [63]:
myString

'bananaz'

In [64]:
first_letter = first( myString )

In [65]:
first_letter

'b'

The basic syntax should follow this overall structure:

In [68]:
def function(  parameter  ):

        #perform an action

        return value # this is optional

The `return` statement is optional, because sometimes we want a function to only perform an action instead of returning a value. Let's see some examples.

In [69]:
def print_first( s ):
    
    print( s[0] )

In [70]:
first_letter = print_first( myString )

b


In [71]:
first_letter

In [72]:
print(first_letter)

None


In [73]:
type(first_letter)

NoneType

In this case, the `print_first()` function only printed the first character of the input string without returning a value, which is the reason why `first_letter` contains a None value.

Let's see another, more realistic example. We will create a function that takes a list and a value as input, and it will modify the list reassigning the value to its third entry.

In [74]:
def modify_list( l , v ):

    l[2] = v


In [75]:
myList

[1, 2, 'cat', 4, 5, 6, 7, 8]

In [76]:
modify_list( myList,  3 )

In [77]:
myList

[1, 2, 3, 4, 5, 6, 7, 8]

As we can see, `modify_list()` modified the input list without returning any values.

##### Some example built-in functions


Some example functions that we will use in the following sections are:

* `len()` which returns the length of an input string, list, tuple or dictionary.
* `range(n,m)` creates a sequence of numbers from `n` to `m-1`.

In [78]:
len(myString)

7

In [79]:
len(myList)

8
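
`range()` on its own does not display its numbers, so a simple way to inspect them is to convert the result to a list. A minimal sketch (the optional third argument, the step, is not mentioned above):

```python
from_two_to_five = list(range(2, 6))    # starts at 2, stops before 6
first_three = list(range(3))            # a single argument means "from 0"
even_numbers = list(range(0, 10, 2))    # third argument is the step

print(from_two_to_five)
print(first_three)
print(even_numbers)
```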

### 0.1 Loops and control flow

#### 0.1.0 Boolean expressions

Boolean expressions are used for producing boolean outputs (True or False).


Let's begin with comparisons:

In [81]:
a = 6
b = 2
c = 6

In [82]:
a == c

True

In [86]:
a == b

False

In [87]:
a != b

True

In [88]:
a > b

True

In [89]:
a > c

False

In [None]:
a >= c

And logical operators:

In [90]:
a = 6
b = 2
c = 6

In [91]:
a == c and a > b 

True

In [92]:
a == b and a > b

False

In [93]:
a == b or a > b

True

In [94]:
not a == c 

False

In [97]:
not a == c or a == c 

True

In [98]:
not ( a == c or a == c )

False

`in` is a useful operator to check for membership:

In [99]:
myList

[1, 2, 3, 4, 5, 6, 7, 8]

In [100]:
"cat" in myList

False

In [101]:
2 in myList

True

In [102]:
myString

'bananaz'

In [103]:
"b" in myString

True

In [104]:
"j" in myString

False
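
The `in` operator also works on dictionaries, where it checks the keys rather than the values. A short sketch using a hypothetical `fruit_counts` dictionary with the same content as `myDictionary`:

```python
# `fruit_counts` is an illustrative copy of myDictionary.
fruit_counts = {"apple": 2, "pear": 3, "grapes": 1}

has_apple_key = "apple" in fruit_counts        # keys are searched
has_two_as_key = 2 in fruit_counts             # values are NOT searched
has_two_as_value = 2 in fruit_counts.values()  # search the values explicitly

print(has_apple_key, has_two_as_key, has_two_as_value)
```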

#### 0.1.1 Loops

Let's see how a simple python for loop works. Let's loop through the letters of `myString` and the items of `myList` and print them.

In [105]:
for i in myString:
    print(i)

b
a
n
a
n
a
z


In [106]:
for i in myList:
    print(i)

1
2
3
4
5
6
7
8


Python for loops are implicit in their index handling of strings and lists. One could read these loops as "for every item in my object..."

What if we wanted to just print the integers from 0 to 10? We would need to resort to the `range()` function.

In [108]:
for i in range(0,11):
    print(i)

0
1
2
3
4
5
6
7
8
9
10


If we want index-based looping similar to R for printing the items in `myList` (avoid if possible):

In [109]:
for i in range( 0,len(myList) ):    # len() returns the length of an object
    print( myList[i] )

1
2
3
4
5
6
7
8


Let's see the behaviour of for loops on dictionaries:

In [116]:
for key in myDictionary:
    print(key)

apple
pear
grapes


In [117]:
for key in myDictionary:
    print( myDictionary[key] )

2
3
1


Since Python 3.7, dictionaries preserve insertion order, so the items are printed in the order in which they were added. In older versions the order was not guaranteed and could seem random.
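
When both the key and the value are needed, the `items()` dictionary method yields them as pairs, which avoids the extra lookup. A minimal sketch with a hypothetical `fruit_counts` dictionary:

```python
# `fruit_counts` is an illustrative copy of myDictionary.
fruit_counts = {"apple": 2, "pear": 3, "grapes": 1}

pairs = []
for fruit, count in fruit_counts.items():   # each item is a (key, value) pair
    pairs.append((fruit, count))
    print(fruit, count)
```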

#### 0.1.2 Control flow

Let's see the syntax of conditional statements. We'll print all **even** numbers from 1 to 10.

In [118]:
for i in range(1,11):
    if i % 2 == 0 :               # % is the modulo operator
        print(i)

2
4
6
8
10


Let's slightly modify our code to report on odd numbers:

In [119]:
for i in range(1,11):
    if i % 2 == 0 :               
        print(i)
    else:
        print("ODD!")

ODD!
2
ODD!
4
ODD!
6
ODD!
8
ODD!
10


Let's further modify our code to introduce elif statements. We'll print numbers from 1 to 20, but if the number is a multiple of 3 or 7, we'll print `PUM!`

In [120]:
for i in range(1,21):
    if i % 3 == 0:
        print("PUM!")
    elif i % 7 == 0:
        print("PUM!")
    else:
        print(i)

1
2
PUM!
4
5
PUM!
PUM!
8
PUM!
10
11
PUM!
13
PUM!
PUM!
16
17
PUM!
19
20


We could reduce the above code using logical operators, as the action followed for multiples of 3 and 7 is the same:

In [123]:
for i in range(1,21):
    if i % 3 == 0 or i % 7 == 0:
        print("PUM!")
    else:
        print(i)

1
2
PUM!
4
5
PUM!
PUM!
8
PUM!
10
11
PUM!
13
PUM!
PUM!
16
17
PUM!
19
20


* Using the `myDictionary` dictionary, print the name of keys whose values are greater than 2.
* Using myList, print all entries which are of type str (string)


**Important note**: Python is *very* strict with indentation. Try writing a poorly indented for loop to see what happens.


### 0.2 Methods

We have already seen some python functions such as `print()` , `len()` and `range()`. Methods are similar to functions but have some extra properties (not all listed):

1. They depend on their association with objects
2. They may not return any value


We'll learn a few of the most used methods.


The general syntax for methods is:

`object.method( arguments )`
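
To make the difference concrete, the sketch below calls one function and two string methods on the same value (variable names are illustrative):

```python
fruit = "banana"

length = len(fruit)          # function: the object is passed as an argument
shouted = fruit.upper()      # method: called on the object with a dot
n_a = fruit.count("a")       # method that takes an argument

print(length, shouted, n_a)
print(fruit)                 # the original string is unchanged
```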

#### 0.2.0 String methods

String methods are for manipulating strings. They all generate new values (they don't modify their input).

##### The `strip()` method removes leading and trailing whitespace from a string.

This is particularly useful when dealing with files. Let's see a simple example

In [None]:
s = "fire coming out of a monkey's head\n\n\n\n\n"
r = "water it!"
s

In [None]:
print(s)
print(r)

In [None]:
print( s.strip() )
print(r)
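
`strip()` works on both ends of the string. When only one end should be cleaned, `lstrip()` and `rstrip()` can be used instead, and all three accept the characters to remove as an optional argument. A small sketch with a made-up line of text:

```python
# `raw` imitates a line read from a file; the content is made up.
raw = "  chr11\t1000\n"

both_ends = raw.strip()            # leading spaces and trailing newline removed
right_only = raw.rstrip()          # only the trailing newline removed
left_only = raw.lstrip()           # only the leading spaces removed
newline_only = raw.rstrip("\n")    # remove trailing newline characters only

print(repr(both_ends))
print(repr(right_only))
print(repr(left_only))
print(repr(newline_only))
```

Note that whitespace inside the string (the tab, in this case) is never touched.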

##### The `startswith()` method returns a boolean indicating whether the string begins with the specified substring

Syntax:

`string.startswith(substring)`


In [None]:
fruit = "banana"

In [None]:
fruit.startswith("app")

In [None]:
fruit.startswith("b")

##### The `split()` method splits the string at a specified character, and returns a list.


This method is really useful for handling character-delimited tables. The syntax would be:

`string.split(delimiter, max)` where `max` is the maximum number of splits (the default is -1, i.e. all occurrences)



In [None]:
grocery = "banana, apple, cheese, milk, fishing rod"

grocery.split(",")
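
Splitting on `","` alone leaves a leading space on most items. The sketch below shows two variations on the same string: using a longer delimiter, and limiting the number of splits (variable names are illustrative):

```python
grocery = "banana, apple, cheese, milk, fishing rod"

with_spaces = grocery.split(",")      # items keep their leading space
clean_items = grocery.split(", ")     # delimiter is a comma plus a space
first_two = grocery.split(", ", 2)    # at most 2 splits; the rest stays whole

print(with_spaces)
print(clean_items)
print(first_two)
```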

* What is the default value for delimiter?
* Say you read a plink map file whose lines look like the code below. Read the line into a dictionary.

In [None]:
map_line = "10 ARS-BFGL-BAC-10960 0 20776707" # chromosome, snp id, centimorgan, position

##### The `format()` method is for creating strings using predefined variables:

In [125]:
today = "Monday"


"Today is {0}".format(  today  )

'Today is Monday'

In [128]:
tomorrow = "Tuesday"


"Today is {0}, tomorrow is {1}".format( today, tomorrow )

'Today is Monday, tomorrow is Tuesday'

An alternative way to do this would be using the string `+` operator, which concatenates strings:

In [129]:
"Today is " + today + ", tomorrow is " + tomorrow

'Today is Monday, tomorrow is Tuesday'

The preferred usage depends on the situation, but using the `format()` method improves code readability.
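
`format()` can do a bit more than positional substitution. As a hedged sketch (the variable names and values are illustrative), placeholders can be named, and a format specification after a colon controls how numbers are displayed:

```python
snp_id = "BovineHD4100000577"
position = 98367573

# Named placeholders make long templates easier to read.
by_name = "{snp}\t{pos}".format(snp=snp_id, pos=position)

# ".2f" means: show as a float with two decimal places.
two_decimals = "{0:.2f}".format(2.71828)

# ">5" means: right-align within a field five characters wide.
padded = "{0:>5}".format(42)

print(by_name)
print(two_decimals)
print(repr(padded))
```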

* Try using both methods to create a string from `myList`. (Hint: use the `str()` function to convert an integer into a string)

##### The `replace()` method returns a string where the specified value has been replaced with another specified value.

`string.replace("old", "new", count)`

In [None]:
s = "I'd like to pet my dog right now"
s

In [None]:
s.replace( "dog", "cat" )

* Try using the "count" option with this new string:

In [None]:
s = "I'd like to pet my dog right now. My dog is amazing"

* You can use the `replace()` method to delete (instead of replace) parts of a string. Figure out how to use it that way.

* There are plenty more string methods. Search the web for other string methods and put one of them to use.

#### 0.2.1 List methods

List methods can modify existing lists or return values. Let's see the most commonly used methods.

##### The `append()` method adds an element to the end of a list

In [None]:
myList

In [None]:
myList.append("dog")

In [None]:
myList

In [None]:
myList.append(10)

In [None]:
myList

In [None]:
myList.append(    ["apple", "banana"]   )

In [None]:
myList

##### The `extend()` method adds the elements of a list to another list

In [None]:
myIntegerList = [ 12 , 13 , 14 , 15 ]
myList.extend( myIntegerList )

In [None]:
myList

##### The `insert()` method inserts elements into a list in a specified position

In [None]:
abc = ["a", "c", "d", "e", "f", "g"]

In [None]:
abc.insert(1,"b")

In [None]:
abc

##### The `pop()`  method removes an element from a list at a specified position

In [None]:
abc.pop(-1)

In [None]:
abc

In [None]:
abc.pop(2)

In [None]:
abc

* As you may have realized, `pop()` modifies the list given as input and returns the removed value. Write a one-liner using `insert()` and `pop()` to complete the `vowels` list using values from `abc`.

In [None]:
vowels = [ "a", "i", "o", "u" ]

In [None]:
vowels.insert( 1, abc.pop(3) )

In [None]:
vowels

##### The `remove()` method removes an element of specified value from a list


`remove()` deletes the first element equal to the specified value. It modifies the list in place without returning any value.

In [None]:
abc

In [None]:
abc.remove("f")

In [None]:
abc

What if we provide a value that is not on the list?

In [None]:
abc.remove("e")

##### The `reverse()` method reverses the order of an input list

In [None]:
abc

In [None]:
abc.reverse()

In [None]:
abc

##### The `sort()` method sorts an input list


`sort()` can take some extra options, such as the sort order and a custom ordering function.

In [None]:
abc = ["a", "c", "e", "k" ,"b"]

In [None]:
abc.sort()

In [None]:
abc

In [None]:
abc.sort(reverse = True)

In [None]:
abc

A particular example of an ordering function would be:

In [None]:
sentence = "I am used to writing flamboyant sentences"
sentence = sentence.split()
sentence

In [None]:
def string_length(s):
    return len(s)

In [None]:
sentence.sort(  key = string_length  )

In [None]:
sentence

#### 0.2.2 Dictionary methods

Let's see some useful dictionary methods.

In [None]:
myDictionary

##### The `get()` method returns the value of a specified key

If the key is not found, `get()` will not raise an error. Instead, it will return `None` or a specified default value.

In [None]:
myDictionary.get('pear')

In [None]:
myDictionary.get('mango')

In [None]:
myDictionary.get('mango', 0)
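
A common use of the default value is counting: `get(key, 0)` returns 0 the first time a key is seen, so no special case is needed. A minimal sketch that counts the bases of a made-up sequence:

```python
# Made-up sequence; `base_counts` is an illustrative name.
sequence = "ATGCGATTA"

base_counts = {}
for base in sequence:
    # 0 is used when the base has not been seen yet
    base_counts[base] = base_counts.get(base, 0) + 1

print(base_counts)
```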

##### The `pop()` method removes an item from the dictionary, returning its value.

This method works in a similar way to the list `pop()` method. If the key is not found, it will raise an error by default. If specified, it will return a value.

In [None]:
myDictionary.pop('apple')

In [None]:
myDictionary

In [None]:
myDictionary.pop('mango')

In [None]:
myDictionary.pop('mango', "fruit not found")

#### 0.2.3 Anonymous functions


Anonymous functions, or lambda expressions are functions that do not have a name, and are usually created on the go.

The syntax would be:

`lambda x1, x2, x3, ..., xn : (some action on x1, x2, x3, ..., xn)`

Lambda expressions can take any number of inputs (including zero).

Let's see some simple examples.

In [None]:
f = lambda x: 10*x + 1

In [None]:
f(3)

In [None]:
f(4)

In [None]:
c = lambda x, y: x**2 + y**2

In [None]:
c(1,1)

In one of the `sort()` method examples, we could have used a lambda expression to make our code more concise. 

Our previous code looked like:

In [None]:
sentence = "I am used to writing flamboyant sentences"
sentence = sentence.split()
sentence

In [None]:
def string_length(s):
    return len(s)

In [None]:
sentence.sort(  key = string_length  )

In [None]:
sentence

We want to sort the word list according to the length of each word. We can use a lambda expression as the key in the `sort()` method.

In [None]:
sentence = "I am used to writing flamboyant sentences"
sentence = sentence.split()
sentence

In [None]:
sentence.sort(  key = lambda s: len(s) ) 

In [None]:
sentence

#### 0.2.4 Exceptions


Sometimes our code can encounter an error that makes our program stop. Exceptions are a way to capture errors and perform actions accordingly.


Let's see a couple of examples:

In [None]:
myDictionary

In [None]:
myDictionary["mango"]

When trying to access a value with a non-existent key, python raises a `KeyError`. We'll handle this error by adding a key to the dictionary with a value of 0.

In [None]:
try:
    myDictionary["mango"]
except KeyError:
    myDictionary["mango"] = 0

In [None]:
myDictionary

Earlier today, we used the `modify_list()` function to re-assign the third value of a list. But what if the list's length is less than 3?

In [None]:
def modify_list( l , v ):
    l[2] = v

In [None]:
short_list = [1,2]

In [None]:
modify_list( short_list , 3 )

The code raises an `IndexError`. We'll modify our function to take this possibility into account.

In [None]:
def modify_list( l , v ):
    try:
        l[2] = v
    except IndexError:
        print("ERROR: List length should be at least 3!")

In [None]:
modify_list( short_list , 3 )

In [None]:
short_list
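
The same `try` / `except` pattern is handy when parsing text, where a field may or may not be a valid number. The sketch below uses a hypothetical helper called `to_int()`; it relies on the fact that `int()` raises a `ValueError` when the text is not an integer:

```python
def to_int(text, default=None):
    """Convert text to an integer, or return `default` if that fails."""
    try:
        return int(text)
    except ValueError:          # raised by int() for text such as "NA"
        return default

good = to_int("42")
missing = to_int("NA")
missing_as_zero = to_int("NA", 0)

print(good, missing, missing_as_zero)
```

Catching only the specific exception we expect (here `ValueError`) keeps unrelated errors visible.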

-----------------------------------

## 1 File I/O





### 1.0 Reading files


A particular syntax for reading files in python would be:

In [None]:
with open( "file.txt", 'r' ) as file:
    #some set of actions

The previous code is just a template, so it will return an error if you attempt to run it because it expects code after the colon. 

Let's dissect the code:


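* The `with` statement makes sure the file is closed automatically once the indented block ends, even if an error occurs inside it.
* The `open()` function takes the filename and the action as input. In this case `'r'` means **r**ead.
* The **`as`** `file` part assigns the opened file to a variable, called `file` in this case.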
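
The following self-contained sketch puts these pieces together. Since `file.txt` may not exist on your machine, it first creates a small throwaway file in a temporary directory (using the standard `tempfile` and `os` modules and the `'w'` mode covered in the next section), then reads it back. The file content and variable names are made up for illustration:

```python
import os
import tempfile

# Create a small throwaway file so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
with open(path, "w") as handle:
    handle.write("header\nfirst line\nsecond line\n")

# Read it back, one line at a time.
lines = []
with open(path, "r") as handle:
    for line in handle:
        lines.append(line.strip())   # each line ends with "\n"; strip it

closed_after_block = handle.closed   # True: `with` closed the file for us

print(lines)
print(closed_after_block)
```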


If we want to print each line of the file:

In [None]:
with open( "file.txt", 'r' ) as file:
    for i in file:
        print(i)

* The `next()` function lets you skip one line of the file at a time. The syntax would be `next(file)`. Skip the header implementing `next()`.

### 1.1 Writing into files

The syntax for writing into files is similar to reading files. We use the `write()` method to write into files.


Syntax:

`file.write(string)`

We want to write a famous haiku into a file called `haiku.txt`:

*old pond*


*frog leaps in*


*water's sound*


In [None]:
haiku = "old pond\nfrog leaps in\nwater's sound"

with open("haiku.txt", 'w' ) as file:
    
    file.write(haiku)

Success! But what if the string was given to us as a list?

In [None]:
haiku = "old pond\nfrog leaps in\nwater's sound".split("\n")
haiku

In [None]:
with open("haiku.txt", 'w' ) as file:
    
    file.write(haiku)

The `write()` method only takes strings as input, so we have to modify our code.

In [None]:
with open("haiku.txt", 'w' ) as file:
    
    for i in haiku:
        
        file.write(i + "\n")
        

* Note the `+ "\n"` in the code above. The `write()` method keeps writing on the same line unless you make it write a newline character, so without it every item of the list would be concatenated with no space between them. Remove the `+ "\n"`, check the resulting file, and then restore it to write a correct `haiku.txt` file.

The `'w'` option tells the `open()` function to **overwrite** an existing file, so be careful! If we want to keep writing into the same file we would use the `'a'` option (as in **a**ppend). Let's append the original Japanese version to the `haiku.txt` file.

In [None]:
haiku_original = "furu ike ya \nkawazu tobikomu \nmizu no oto".split("\n")

In [None]:
with open("haiku.txt", 'a' ) as file:
    
    file.write("-----------\n")
    file.write("The original japanese: \n")
    for i in haiku_original:
        file.write( i + "\n" )
        

### 1.2 Combining reading and writing


Let's use a more realistic example to review what we know so far. We have a genomic gtf file, and we would like to generate a gtf file whose entries only belong to chromosome 11.

We need to write a script:

**input:** genomic gtf file


**output:** chromosome 11 gtf file



In [None]:
inFile = "bos_taurus.gtf"
outFile = "bos_taurus_ch11.gtf"

In [None]:
with open( inFile, 'r' ) as genome:                                       # open genome gtf (read mode)
       
    with open(outFile, 'w' ) as chromosome_11:                            # open chr11 gtf (write mode)
        
        for line in genome:                                               # for each line in genome file
            
            if line.startswith("#"):                                      # if line begins with comment character #
            
                pass                                                      # do nothing (skip line)
            
            else:                                                         # if it is an entry line
                
                g_list = line.split()                                     # read contents into a list
                
                if g_list[0] == "chr11":                                  # if the first element (chromosome) is chr11
                        
                        chromosome_11.write( line )                       # write it to the new file
                        
        

There are many ways to read and write files in python. The best choice depends on your file formats and on memory constraints.
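
As one alternative sketch of the chromosome filter above, the same logic can be wrapped in a function, with both files opened in a single `with` statement and `continue` used to skip comment lines. The function name `filter_chromosome()` and the toy input are hypothetical; the input is written to a temporary directory so the example runs without a real gtf file:

```python
import os
import tempfile

def filter_chromosome(in_path, out_path, chromosome):
    """Copy to out_path the lines whose first column equals `chromosome`."""
    kept = 0
    with open(in_path, "r") as source, open(out_path, "w") as target:
        for line in source:
            if line.startswith("#"):
                continue                      # skip comment lines
            fields = line.split()
            if fields and fields[0] == chromosome:
                target.write(line)            # line already ends with "\n"
                kept += 1
    return kept

# Toy, made-up gtf-like input for the demonstration.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, "toy.gtf")
out_path = os.path.join(workdir, "toy_chr11.gtf")
with open(in_path, "w") as handle:
    handle.write("#!genome-build toy\n")
    handle.write("chr11\tsrc\tgene\t100\t200\n")
    handle.write("chr2\tsrc\tgene\t300\t400\n")
    handle.write("chr11\tsrc\texon\t120\t180\n")

n_kept = filter_chromosome(in_path, out_path, "chr11")
with open(out_path, "r") as handle:
    kept_lines = handle.read().splitlines()

print(n_kept)
print(kept_lines)
```

Reading line by line, as in both versions, keeps memory usage low because the whole file is never loaded at once.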

-----------------------------

## 2 Useful modules


Python modules are imported in the following way:


`import [module]`

If we want to only load a particular method or class from a module, we can use:


`from [module] import [method]`
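
A short sketch of these import forms using the standard `math` module. The third form, `import ... as`, is not described above; it loads a module under a shorter alias:

```python
import math              # whole module; names are used as math.<name>
from math import sqrt    # a single function, used without the prefix
import math as m         # same module under a shorter alias

via_module = math.sqrt(9)
via_name = sqrt(9)
via_alias = m.sqrt(9)

print(via_module, via_name, via_alias)
```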

### 2.0 `sys` module

When writing a script, we also want to make it possible to give arguments to the script. The `sys` module lets us do that.


We want to create a script that prints numbers from 1 to a specified value. Running this script in the terminal should look like:

`python3 script.py <maximum>`


In [None]:
import sys  # First, we import the sys module

The `sys` module provides a list of all the argument values passed to the script: `sys.argv`.

The first element of the list `sys.argv[0]` contains the name of the script, and the following items contain the arguments that have been passed into the script.


Our script would look like:

In [None]:
import sys

maximum = int( sys.argv[1] )   # arguments arrive as strings, so convert to an integer


for i in range( 1, maximum  + 1 ):
    print(i)
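
Scripts usually benefit from checking their arguments before using them. The sketch below is one possible way to do it, using a hypothetical helper called `parse_maximum()`. It takes the argument list as a parameter so it can be tried inside a notebook; in a real script you would pass `sys.argv` to it:

```python
def parse_maximum(argv):
    """Return the <maximum> argument as an integer, or stop with a usage message."""
    if len(argv) != 2:
        raise SystemExit("usage: python3 script.py <maximum>")
    return int(argv[1])        # command-line arguments are always strings

# Simulated argument list; in a script this would be sys.argv.
maximum = parse_maximum(["script.py", "5"])

numbers = list(range(1, maximum + 1))
print(numbers)
```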



### 2.1 `math` and `numpy` modules

These modules are necessary for a more mathematical use of python. 

* `math` contains plenty of mathematical functions (real values).

* `numpy` contains methods and classes for linear algebra and random number generators.

### 2.2 `Biopython`

Biopython is a module that contains classes and methods for handling biological data. We will see an example usage down below.


Import it using:

`import Bio`


-----------------------------

## 3 Example scripts

In this section we will check some simple example scripts written in python for bioinformatics data wrangling. The code can be found on [GitHub](https://github.com/gaxyz/utilidad).

### 3.0 One-line fasta using Biopython

The next script uses Biopython for converting a multiline fasta into a one-line fasta.

The terminal usage would be:

`fasta1linea.py <multiline.fa> > singleLine.fa`

In [None]:
#!/usr/bin/env python3
import sys
from Bio import SeqIO

inputfile = sys.argv[1]

seqdict = {}
for seq in SeqIO.parse( inputfile , "fasta" ):
    seqdict[seq.id] = seq.seq

for item in seqdict:

    print(">" + item )
    print( str( seqdict[item] ) )

### 3.1 Update coordinates of a plink map file

The next script takes two map files as input. It uses the second map file's coordinates as a reference for creating a map file identical to the first map file with updated coordinates.

The terminal usage would be:

`modify_map.py map_to_update.map reference.map updated.map`

In [None]:
#!/usr/bin/env python3

import sys

print( "\nRunning {0}...".format( sys.argv[0] ) )

oldMap = sys.argv[1]
newMap = sys.argv[2]

output = sys.argv[3]


d = {}
with open( newMap , 'r' ) as handle:


    for line in handle:
        
        chromosome, snpName, cM, pos = line.split()

        d[snpName] = [ chromosome, cM, pos ]


with open( oldMap , 'r' ) as old:
    with open( output, 'w' ) as out:

        modifiableChr = ["30", "31", "32", "33"]
        chrDict = {
                "30":"X",
                "31":"Y",
                "32":"30",
                "33":"MT"
                }

        for line in old: 
            old_chr, old_id, old_cm, old_pos = line.split()

            new_chr, new_cm, new_pos = d[old_id]
            new_cm = 0 # the centimorgan value is not needed for now, so set it to 0
            if new_chr in modifiableChr:
                new_chr = chrDict[ new_chr  ]
                       


            out.write( "{0} {1} {2} {3}\n".format(new_chr, old_id, new_cm, new_pos  ) )

print( "--> Success...\n\n" )
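
The chromosome recoding step in the script above could also be written with the dictionary `get()` method seen earlier, which removes the need for the separate `modifiableChr` list. This is only a sketch of an alternative, with a hypothetical helper called `recode()`:

```python
# Same mapping as in the script above.
chrDict = {"30": "X", "31": "Y", "32": "30", "33": "MT"}

def recode(chromosome):
    """Return the recoded chromosome name, or the input itself if no recoding applies."""
    return chrDict.get(chromosome, chromosome)

recoded_x = recode("30")
recoded_30 = recode("32")
unchanged = recode("11")

print(recoded_x, recoded_30, unchanged)
```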

-------------------------

## 4 Extras: python related stuff

### 4.0 Conda

### 4.1 Jupyter notebooks