# Introduction to Python - Session 1
1. Programming. What is Python?
2. What are Jupyter Notebooks?
3. Paths and directories
4. Creating a Python script/notebook
5. Basics of Python: objects, syntax, functions
6. Data types
    - Numbers
    - Lists
    - Strings
    - Dictionaries

SLIDES [HERE](https://docs.google.com/presentation/d/1fOlsjAHyX7_E5tS699iphh-JqjO9Ghy8OsE-ChgBYKE/export/pdf)

**Getting help**

The built-in function `help()` shows interactive documentation for most Python objects.

In [66]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.
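
As a brief illustration of the keyword arguments listed in the help text, the sketch below exercises `sep`, `end` and `file`. The in-memory `io.StringIO` buffer and the variable names are assumptions chosen for this example; they are not part of the course material.

```python
import io

# sep: join the values with a dash instead of the default space.
print("A", "T", "G", sep="-")

# end: replace the default newline so the next print continues on the same line.
print("Loading", end="... ")
print("done")

# file: write to any file-like object; an in-memory buffer stands in for a real file.
buffer = io.StringIO()
print("Hello", "world", sep=", ", end="!", file=buffer)
captured = buffer.getvalue()
print(repr(captured))  # 'Hello, world!'
```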



## EXERCISE 1 - Getting started
Exercises are explained in this notebook. Use the cells below each question to write and run your own answers. Remember that you can add text cells as needed and use `#` to comment your code.

In [1]:
# This is a comment
print("Hello world")

Hello world


**1. Get familiar with the notebook environment. What is the working directory?**

The working directory is the directory that contains this notebook, "Session1.ipynb". For example, at the CRG, it might be "users/[group]/[user]/Python_Course/Session1". Using `~` as the abbreviation for the home directory, the same location can be written as "~/Python_Course/Session1".
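
One way to check the working directory from Python itself is sketched below, using the standard `os` and `pathlib` modules. The printed paths depend on where the notebook is running, so no output is shown.

```python
import os
from pathlib import Path

# Current working directory as a plain string.
cwd = os.getcwd()
print(cwd)

# The same location as a Path object, and the home directory that "~" abbreviates.
print(Path.cwd())
print(Path.home())
```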

**2. Using Python, calculate the percentage of males and females currently present in the training.**

In [6]:
# 6 males out of 19 students:
print((6/19) * 100)

# 13 females out of 19 students:
print((13/19) * 100)

31.57894736842105
68.42105263157895
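
The long decimals above can be shortened for display. The following sketch, which assumes the same counts (6 and 13 out of 19) and uses illustrative variable names, shows `round()` and a format specifier inside an f-string.

```python
males = 6
females = 13
total = males + females

pct_males = males / total * 100
pct_females = females / total * 100

# round() returns a number rounded to the requested number of decimals.
print(round(pct_males, 2))  # 31.58

# The format specifier .1f produces text with exactly one decimal place.
print(f"Males: {pct_males:.1f}%")      # Males: 31.6%
print(f"Females: {pct_females:.1f}%")  # Females: 68.4%
```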


**3. Create a new object `myobject` with value 60. Print `myobject` in the console.**

In [7]:
myobject = 60
print(myobject)

60


**4. Reassign `myobject` with value 87.**

In [14]:
myobject = 87
print(myobject)

87


**5. Subtract 1 from `myobject` and reassign.**

In [15]:
# Option 1
myobject = myobject - 1
# Option 2 (use += and -= to reassign an object directly after adding/subtracting)
# This is an alternative to Option 1; running both would subtract 1 twice.
# myobject -= 1
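
The augmented assignment operators mentioned in the comment can be sketched as follows. The variable `counter` is a hypothetical name used only for this illustration.

```python
counter = 10
counter += 5   # same as counter = counter + 5
counter -= 3   # same as counter = counter - 3
counter *= 2   # same as counter = counter * 2
print(counter)  # 24
```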

## EXERCISE 2 - Data types: strings

**1. Create the string `mystring` with value "Hello world!", and print it.**

In [8]:
mystring = "Hello world!"
print(mystring)

Hello world!


**2. Print "This is my string: Hello world!".**

In [76]:
print("This is my string: %s" % (mystring))

This is my string: Hello world!
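
The `%` operator is one of several ways to build this text. As a sketch of the alternatives, the example below produces the same result with `str.format` and with an f-string; the variable names are illustrative.

```python
mystring = "Hello world!"

with_percent = "This is my string: %s" % mystring
with_format = "This is my string: {}".format(mystring)
with_fstring = f"This is my string: {mystring}"

print(with_fstring)  # This is my string: Hello world!

# All three approaches build exactly the same text.
print(with_percent == with_format == with_fstring)  # True
```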


**3. How long is the string?**

In [78]:
len(mystring)

12

**4. Get only the word "world" from `mystring` using numeric indexes.**

In [80]:
mystring[6:11]

'world'
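
A few more indexing and slicing patterns on the same string may help here. This is a minimal sketch; the variable names for the results are illustrative.

```python
mystring = "Hello world!"

first_char = mystring[0]       # 'H'
last_char = mystring[-1]       # '!' (negative indexes count from the end)
first_word = mystring[:5]      # 'Hello' (start omitted: begin at index 0)
rest = mystring[6:]            # 'world!' (end omitted: go to the end)
from_the_end = mystring[-6:-1] # 'world'

print(first_char, last_char, first_word, rest, from_the_end)
```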

**5. Split `mystring` word by word. Use the method `split`. Then `join` the words with slashes.**

In [11]:
splitted = mystring.split(" ")
"/".join(splitted)

'Hello/world!'

**6. Print `mystring` in upper case. Use the method `upper`.**

In [95]:
mystring.upper()

'HELLO WORLD!'

**7. Write a Python program that finds the position of the first cytosine (C) in the sequence "ATGTCACCGTTT".**

In [7]:
sequence = "ATGTCACCGTTT"
firstC = sequence.find("C")
print("The first cytosine is in position %i" % (firstC + 1)) # Remember that Python is 0-based, so we need to add 1

The first cytosine is in position 5
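
Some related string methods are worth knowing when searching a sequence. The sketch below uses the same sequence; the base "U" is included only as an example of a character that is absent.

```python
sequence = "ATGTCACCGTTT"

first_c = sequence.find("C")   # 4 (0-based index of the first C)
last_c = sequence.rfind("C")   # 7 (0-based index of the last C)
n_c = sequence.count("C")      # 3

# find() returns -1 when the character is absent, so check before adding 1.
missing = sequence.find("U")   # -1

print(first_c + 1, last_c + 1, n_c, missing)  # 5 8 3 -1
```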


## EXERCISE 3 - Data types: numbers

**1. Create three variables `a`, `b`, and `c` with values 2, 5 and 3 respectively.**

In [83]:
# Option 1
a,b,c = 2,5,3
# Option 2
a = 2
b = 5
c = 3

**2. Is `a` greater than or equal to `c`? And `b`?**

In [88]:
print(a>=c)
print(b>=c)

False
True


**3. Try `(a+c)*5`. Assign it to `d`.**

In [86]:
d = (a+c)*5
d

25

**4. Try `b**a`. Is it the same as `d`? Use `==` for the comparison; `is` checks whether two names refer to the same object, not whether their values are equal.**

In [90]:
(b**a) == d

True
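
The difference between `==` and `is` is easier to see with lists than with small integers. The following sketch uses hypothetical variable names to compare equality and identity.

```python
list_a = [1, 2, 3]
list_b = [1, 2, 3]  # a separate list with the same contents
alias = list_a      # a second name for the very same list

same_value = list_a == list_b   # True: the contents are equal
same_object = list_a is list_b  # False: they are two distinct objects
alias_check = alias is list_a   # True: both names refer to one object

print(same_value, same_object, alias_check)  # True False True
```

For numbers and strings, `==` is the comparison to use; `is` is mostly reserved for checks such as `value is None`.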

**5. What is `d` divided by `a`? If not exact, what is the remainder?**

In [89]:
print(d/a)
print(d%a)

12.5
1
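
Besides `/` and `%`, Python offers floor division with `//` and the built-in `divmod`, which returns the quotient and the remainder together. A minimal sketch with the same values of `d` and `a`:

```python
d = 25
a = 2

quotient = d // a               # 12 (floor division discards the fractional part)
remainder = d % a               # 1
both = divmod(d, a)             # (12, 1)

# The original number can be rebuilt from quotient and remainder.
print(quotient * a + remainder == d)  # True
```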


## EXERCISE 4 - Data types: lists

**1. Create a list `y` which contains the numbers from 2 to 11, both included. Print `y` in the console.**

In [70]:
# Option 1
y = [2,3,4,5,6,7,8,9,10,11]
print(y)

# Option 2 (using function range). Note that the end value of range is not included
y = list(range(2,12))
print(y)

[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]


**2. How many elements are in `y`? That is, what is the length of the list `y`?**

In [39]:
len(y)

10

**3. Show the 2nd element of `y`. Remember that Python uses 0-based indexing.**

In [68]:
y[1]

3

**3. Show from 5th to 7th element of `y`.**

In [71]:
y[4:7] # Note that the end index is not included

[6, 7, 8]
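
Slices also accept open ends and a step. The sketch below applies a few such patterns to the same list; the result names are illustrative.

```python
y = list(range(2, 12))

fifth_to_seventh = y[4:7]  # [6, 7, 8]
last_three = y[-3:]        # [9, 10, 11]
every_second = y[::2]      # [2, 4, 6, 8, 10]
reversed_copy = y[::-1]    # a reversed copy; y itself is unchanged

print(fifth_to_seventh, last_three, every_second)
print(reversed_copy[0], y[0])  # 11 2
```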

**4. Show the last element of `y`.**

In [27]:
y[-1]

11

**5. Remove the 4th element of `y` and reassign. Use either the method `pop` or the method `remove`. What is now the length of `y`?**

In [40]:
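# Option 1 - by index (the 4th element is at index 3)
y.pop(3)
# Option 2 - by value (alternative to Option 1; running both raises ValueError
# because the value 5 has already been removed)
# y.remove(5)
len(y)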

9
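
The two methods behave differently: `pop` takes an index and returns the removed element, while `remove` takes a value and deletes only its first occurrence. A minimal sketch on a hypothetical list `values`:

```python
values = [2, 3, 4, 5, 6, 5]

# pop(index) removes the element at that position and returns it.
removed = values.pop(3)
print(removed)  # 5
print(values)   # [2, 3, 4, 6, 5]

# remove(value) deletes the first matching element and returns None.
values.remove(5)
print(values)   # [2, 3, 4, 6]
```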

**6. What are the minimum, maximum and sum of values in `y`?**

In [59]:
print(max(y))
print(min(y))
print(sum(y))

11
2
60


**7. Check whether 1 and 9 are present in `y` list. Use the `in` operator.**

In [47]:
print(1 in y)
print(9 in y)

False
True


**8. Create a list `x=[1,2,3,1,2,3,1,2,3]`, but expressed as a repetition of `[1,2,3]`. Use `*`.**

In [63]:
x=[1,2,3]*3
x

[1, 2, 3, 1, 2, 3, 1, 2, 3]

**9. Add an additional element `15` in the list `x`. Use `append`.**

In [64]:
x.append(15)
print(x)

[1, 2, 3, 1, 2, 3, 1, 2, 3, 15]


**10. Add four additional elements `[45,72,4,6]` in the list `x`. Use `extend`.**

In [65]:
x.extend([45,72,4,6])
print(x)

[1, 2, 3, 1, 2, 3, 1, 2, 3, 15, 45, 72, 4, 6]


**11. Order the elements of `x` using the method `sort`.**

In [92]:
x.sort()
x

[1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 6, 15, 45, 72]
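
Note that `sort` changes the list in place and returns `None`. When the original order should be kept, the built-in `sorted` returns a new list instead. A short sketch with a hypothetical list `numbers`:

```python
numbers = [3, 1, 2]

ordered_copy = sorted(numbers)              # new list; numbers is unchanged
descending = sorted(numbers, reverse=True)  # new list, largest first
print(numbers, ordered_copy, descending)    # [3, 1, 2] [1, 2, 3] [3, 2, 1]

result = numbers.sort()  # sorts in place and returns None
print(result, numbers)   # None [1, 2, 3]
```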

**12. Remove duplicated numbers in the list using the function `set`.**

In [96]:
print(set(x))

{1, 2, 3, 4, 6, 72, 45, 15}
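
A set has no guaranteed order, which is why the printed values above are not sorted. If an ordered result is needed, the following sketch shows two possible approaches; `dict.fromkeys` is used here as one common way to keep the order of first appearance.

```python
x = [1, 2, 3, 1, 2, 3, 15, 45, 72, 4, 6]

# Sorted unique values.
unique_sorted = sorted(set(x))
print(unique_sorted)    # [1, 2, 3, 4, 6, 15, 45, 72]

# Unique values in order of first appearance (dictionary keys keep insertion order).
unique_in_order = list(dict.fromkeys(x))
print(unique_in_order)  # [1, 2, 3, 15, 45, 72, 4, 6]
```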


**13. Given the protein sequence "MPISEPTFFEIF", split the sequence into its component amino acid codes and count how many phenylalanines (F) there are.**

In [5]:
protein = list("MPISEPTFFEIF")
protein.count("F")

3
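
To count every amino acid at once rather than a single one, the standard library's `collections.Counter` can be used. This is a sketch of one possible approach, not part of the original exercise.

```python
from collections import Counter

protein = list("MPISEPTFFEIF")

# Counter tallies how many times each element occurs.
counts = Counter(protein)
print(counts["F"])  # 3
print(counts["P"])  # 2

# most_common(1) returns the most frequent element with its count.
print(counts.most_common(1))  # [('F', 3)]
```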

## EXERCISE 5 - Data types: dictionaries

**1. Create a dictionary `mydict` that maps the one-letter amino acid codes `A`, `C`, `D` and `E` to the three-letter codes `Ala`, `Cys`, `Asp` and `Glu`.**

In [1]:
mydict = {'A': 'Ala', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu'}

**2. Print the keys and values of the dictionary. Use the methods `keys` and `values`. Print the three-letter code of `C`.**

In [2]:
print(mydict.keys())
print(mydict.values())
print(mydict['C'])

dict_keys(['A', 'C', 'D', 'E'])
dict_values(['Ala', 'Cys', 'Asp', 'Glu'])
Cys
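
Two further dictionary methods are often useful alongside `keys` and `values`: `items` for key/value pairs and `get` for lookups that may fail. In the sketch below, the key `'W'` and the peptide "ACDE" are hypothetical examples chosen for illustration.

```python
mydict = {'A': 'Ala', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu'}

# items() yields (key, value) pairs, which is convenient in a loop.
for one_letter, three_letter in mydict.items():
    print(one_letter, "->", three_letter)

# get() returns a default instead of raising KeyError for a missing key.
unknown = mydict.get('W', 'unknown')
print(unknown)  # unknown

# Translate a short peptide into three-letter codes.
peptide = "ACDE"
translated = "-".join(mydict[residue] for residue in peptide)
print(translated)  # Ala-Cys-Asp-Glu
```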


**3. Add phenylalanine to `mydict`.**

In [105]:
mydict['F'] = 'Phe'

**4. Add a fake amino acid `A` that maps to the value `Fake`. Print `mydict`.**

In [106]:
mydict['A'] = 'Fake'
print(mydict) # Note that we have overwritten the previous value for alanine. Duplicate keys are not allowed.

{'A': 'Fake', 'C': 'Cys', 'D': 'Asp', 'E': 'Glu', 'F': 'Phe'}
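
Because assigning to an existing key silently replaces its value, it can be useful to check for the key first. The sketch below shows two ways to add an entry only when the key is missing; it starts from a smaller dictionary for brevity.

```python
mydict = {'A': 'Ala', 'C': 'Cys'}

# Only assign when the key is not there yet, so alanine is left untouched.
if 'A' not in mydict:
    mydict['A'] = 'Fake'

# setdefault() adds the key only if it is missing and returns the stored value.
stored_a = mydict.setdefault('A', 'Fake')
stored_f = mydict.setdefault('F', 'Phe')

print(stored_a, stored_f)  # Ala Phe
print(mydict)              # {'A': 'Ala', 'C': 'Cys', 'F': 'Phe'}
```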
