# Data Types

## 1. Numeric

There are three numeric types in Python:

- Int: Int, or integer, is a whole number, positive or negative, without decimals, of unlimited length.

- Float: Float, or "floating point number", is a number, positive or negative, containing one or more decimals. Floats can also be written in scientific notation, with an "e" indicating the power of 10.

- Complex: Complex numbers are written with a "j" marking the imaginary part, for example 3 + 4j.

Variables of numeric types are created when you assign a value to them.

In [3]:
x = 1    # int
y = 2.8  # float
z = 1j   # complex
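The built-in type() is a quick way to confirm which numeric type a value has. The sketch below reuses x, y and z from the cell above; the names avogadro and w are introduced here only for illustration.

```python
x = 1    # int
y = 2.8  # float
z = 1j   # complex

print(type(x).__name__)   # int
print(type(y).__name__)   # float
print(type(z).__name__)   # complex

# Scientific notation always produces a float, even for a whole value
avogadro = 6.022e23       # illustrative value, not from the original notebook
print(type(avogadro).__name__)   # float

# A complex number stores its real and imaginary parts as floats
w = 3 + 4j
print(w.real, w.imag)     # 3.0 4.0
```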

In [2]:
human_genes = 21306         # int type. Not 21,306 or 21 306
US_population = 328918373
print('Number of human genes:', human_genes)
print('Number of human genes in US:', human_genes*US_population)

Number of human genes: 21306
Number of human genes in US: 7007934855138
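Commas and spaces are not allowed inside a number, but Python does accept underscores as digit separators. The following sketch shows that, and also illustrates the "unlimited length" of int; the variable big is an assumption added for the example.

```python
# Underscores are ignored by Python and only help readability
human_genes = 21_306
US_population = 328_918_373
print(human_genes * US_population)   # 7007934855138

# Integers grow as needed, so large powers do not overflow
big = 2 ** 100
print(big)                           # 1267650600228229401496703205376
```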


In [7]:
# Floats are implemented using double in C
exons_per_gene = 8.9        # float type
print('human exons per gene:', exons_per_gene)
human_exons = exons_per_gene*human_genes
print("human_exons =", human_exons)
print('Number of human exons:', human_exons)

human exons per gene: 8.9
human_exons = 189623.4
Number of human exons: 189623.4


In [8]:
# To convert float to integer
human_exons = int(human_exons)
print('Approximate number of human exons:', human_exons)

Approximate number of human exons: 189623


**Convert from int to float:**
a = float(x)

**Convert from float to int:**
b = int(y)

**Convert from int to complex:**
c = complex(x)

print(a)
print(b)
print(c)

print(type(a))
print(type(b))
print(type(c))
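One detail worth knowing about int(): it drops the fractional part rather than rounding. A small sketch comparing it with round() and math.floor(), using an illustrative variable named value:

```python
import math

value = 2.8

truncated = int(value)            # drops the fraction
truncated_neg = int(-value)       # truncation goes toward zero
rounded = round(value)            # rounds to the nearest integer
floored_neg = math.floor(-value)  # floor always goes down

print(truncated)      # 2
print(truncated_neg)  # -2
print(rounded)        # 3
print(floored_neg)    # -3
```

This is why the exon count above is labelled "approximate": 189623.4 becomes 189623 by truncation.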

In [13]:
# Can Python do arithmetic?
import math
firstProduct = (9.4*0.2321)*5.6
secondProduct = 9.4*(0.2321*5.6)
print('(9.4*0.2321)*5.6 - 9.4*(0.2321*5.6) =',
      (firstProduct - secondProduct))

(9.4*0.2321)*5.6 - 9.4*(0.2321*5.6) = -1.7763568394002505e-15
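Because the two products differ by a tiny rounding error, comparing floats with == is fragile. One common alternative, sketched here, is math.isclose(), which checks whether two values agree within a relative tolerance (1e-09 by default).

```python
import math

first_product = (9.4 * 0.2321) * 5.6
second_product = 9.4 * (0.2321 * 5.6)

# The difference is tiny compared with the values themselves
difference = first_product - second_product
print('difference =', difference)

# Compare within a tolerance instead of testing exact equality
same = math.isclose(first_product, second_product)
print('close enough?', same)   # True
```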


In [16]:
# To access mathematical functions from math module
two_pi = 2.0*math.pi
print('two_pi =', two_pi)
print('sin(two_pi) =', math.sin(two_pi))
print('Do you believe this result?')
# Mathematically, sin(2π) = 0, but because of floating-point precision limits, Python returns a very small number close to zero

two_pi = 6.283185307179586
sin(two_pi) = -2.4492935982947064e-16
Do you believe this result?
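When the expected value is exactly zero, a relative tolerance cannot help, because any fraction of zero is still zero. A sketch of how an absolute tolerance could be used instead; the threshold 1e-12 is an arbitrary choice for illustration.

```python
import math

result = math.sin(2.0 * math.pi)

# Default settings use only a relative tolerance, which fails against 0.0
relative_check = math.isclose(result, 0.0)
print(relative_check)   # False

# abs_tol sets how far from zero still counts as "zero"
absolute_check = math.isclose(result, 0.0, abs_tol=1e-12)
print(absolute_check)   # True
```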


---
## 2. Strings
Strings in Python are surrounded by either single or double quotation marks.

'hello' is the same as "hello".

In [18]:
protein = "GFP"                    # Green fluorescent protein; its discovery won the 2008 Nobel Prize in Chemistry
protein_seq_begin = 'MSKGEELFTG'
protein_seq_end = 'HGMDELYK'

# Concatenation of strings
protein_seq = protein_seq_begin + '...' + protein_seq_end
print('Protein sequence of GFP: ' + protein_seq)

Protein sequence of GFP: MSKGEELFTG...HGMDELYK
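The + operator only joins strings to strings, so mixing in a number raises a TypeError. A minimal sketch of one way around this, using an f-string; the names length_shown and message are introduced here for illustration.

```python
protein = "GFP"
protein_seq = 'MSKGEELFTG' + '...' + 'HGMDELYK'

# len() returns an int, which cannot be added to a string directly
length_shown = len(protein_seq)

# An f-string converts each value inside {} to text automatically
message = f'{protein} fragment: {protein_seq} ({length_shown} characters shown)'
print(message)
```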


DNA_seq = DNA_seq.upper()

- The **.upper() string method** converts all lowercase letters in the string to uppercase.

Strings in Python are immutable (they can't be changed in place), so .upper() returns a new string, which is then assigned back to DNA_seq.

In [20]:
# String method str.upper()
DNA_seq = 'atgagtaaag...actatacaaa'
DNA_seq = DNA_seq.upper()
print('DNA sequence: ' + DNA_seq)

DNA sequence: ATGAGTAAAG...ACTATACAAA
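Immutability can be checked directly. In this sketch the result of .upper() is stored under a separate, illustrative name (upper_seq) so that the untouched original stays visible, and an attempt to change one character in place is caught.

```python
DNA_seq = 'atgagtaaag...actatacaaa'

# .upper() builds a new string; the original is left as it was
upper_seq = DNA_seq.upper()
print(DNA_seq)     # atgagtaaag...actatacaaa
print(upper_seq)   # ATGAGTAAAG...ACTATACAAA

# Assigning to a single position is not allowed for strings
try:
    DNA_seq[0] = 'A'
    changed_in_place = True
except TypeError as err:
    changed_in_place = False
    print('Cannot modify in place:', err)
```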


In [21]:
# Forward index starts with 0 and increases
# Backward index starts with -1 and decreases
print('The second nucleotide:', DNA_seq[1])
print('The last nucleotide:', DNA_seq[-1])

The second nucleotide: T
The last nucleotide: A


In [25]:
# Slicing a string
first_codon = DNA_seq[0:3]         # index 3 excluded
last_codon = DNA_seq[-3:]
print("first codon:", first_codon)
print('Last codon:', last_codon)

first codon: ATG
Last codon: AAA
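Slices also accept a third value, the step. The sketch below uses a short made-up fragment (coding_seq is hypothetical, not the GFP sequence) to split a sequence into codons and to reverse it.

```python
coding_seq = 'ATGAGTAAAGGA'   # hypothetical 12-nucleotide fragment

# Take consecutive slices of length 3, starting at 0, 3, 6, 9
codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]
print(codons)         # ['ATG', 'AGT', 'AAA', 'GGA']

# A step of 3 keeps every third nucleotide
every_third = coding_seq[::3]
print(every_third)    # AAAG

# A step of -1 walks the string backwards
reversed_seq = coding_seq[::-1]
print(reversed_seq)   # AGGAAATGAGTA
```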


---
## 3. List
Lists are used to store multiple items in a single variable.

Lists are one of four built-in data types in Python used to store collections of data. The other three are Tuple, Set, and Dictionary, each with different properties and uses.

Lists are created using square brackets:

In [27]:
# Constructing a list
stop_codons = ['TAA', 'tAG']
print(stop_codons)

['TAA', 'tAG']


In [29]:
# Accessing an item in a list
first_stop_codon = stop_codons[0]
print(first_stop_codon)

TAA


In [30]:
# Modifying an item in a list
stop_codons[1] = 'TAG'
print(stop_codons)

['TAA', 'TAG']


In [31]:
# Appending an item to the end of a list
stop_codons.append('TGA')
print(stop_codons)

['TAA', 'TAG', 'TGA']


**stop_codons.append('TGA')**

The .append() method adds a new element to the end of the list.  
Note: .append() modifies the original list in place; it doesn't create a new list.
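A consequence of in-place modification is that .append() returns None, so its result should not be assigned back to the list. The sketch below contrasts it with +, which builds a new list; 'ATG' (a start codon) is used only as an illustrative extra item.

```python
stop_codons = ['TAA', 'TAG']

# .append() changes the list itself and returns None
result = stop_codons.append('TGA')
print(result)        # None
print(stop_codons)   # ['TAA', 'TAG', 'TGA']

# '+' creates a new list and leaves the original untouched
with_start = ['ATG'] + stop_codons
print(with_start)    # ['ATG', 'TAA', 'TAG', 'TGA']
print(stop_codons)   # ['TAA', 'TAG', 'TGA']
```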

In [32]:
# Number of items in a list
number_of_stop_codons = len(stop_codons)
print('There are', number_of_stop_codons, 'stop codons')

There are 3 stop codons


**len()** returns the number of items in a list.

In [36]:
# Convert list to a string
DNA_seq = ''.join(stop_codons)
print(DNA_seq)

TAATAGTGA


The join() method combines all elements of a list into a single string.

The string before .join() ('' in this case) is the separator placed between each element.

Since the separator is an empty string (''), all list elements are joined without spaces or characters between them.
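To see the role of the separator, the same list can be joined with different strings. This is a small sketch; the list counts is hypothetical and shows that non-string items must be converted first.

```python
stop_codons = ['TAA', 'TAG', 'TGA']

no_separator = ''.join(stop_codons)
dashed = '-'.join(stop_codons)
comma_separated = ', '.join(stop_codons)

print(no_separator)      # TAATAGTGA
print(dashed)            # TAA-TAG-TGA
print(comma_separated)   # TAA, TAG, TGA

# join() only accepts strings, so numbers are converted with str()
counts = [3, 1, 2]       # hypothetical values
counts_text = ' '.join(str(n) for n in counts)
print(counts_text)       # 3 1 2
```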

In [37]:
# Convert string to a list
DNA_list = list(DNA_seq)
print(DNA_list)

['T', 'A', 'A', 'T', 'A', 'G', 'T', 'G', 'A']


In [38]:
# Slicing a list
second_codon = DNA_list[3:6]             # index 6 not included
print('Second codon:', second_codon)

Second codon: ['T', 'A', 'G']


In [39]:
# Copying a list
DNA_list_duplicate = DNA_list.copy()
print(DNA_list_duplicate)

['T', 'A', 'A', 'T', 'A', 'G', 'T', 'G', 'A']
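The reason to use .copy() rather than plain assignment is that assignment only gives the same list a second name. A minimal sketch, with alias and duplicate as illustrative names:

```python
DNA_list = list('TAATAGTGA')

alias = DNA_list             # second name for the same list
duplicate = DNA_list.copy()  # independent list with the same items

# Change the original and see which of the two follows
DNA_list[0] = 'N'

print(alias[0])               # N
print(duplicate[0])           # T
print(alias is DNA_list)      # True
print(duplicate is DNA_list)  # False
```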


In [41]:
# Insert, delete element
DNA_list_duplicate.insert(5, "?") # The "?" is inserted before the element that was originally at index 5.
print(DNA_list_duplicate)

DNA_list_duplicate.pop(5)        # Removes the element at index 5 (the "?"). Can also use: del DNA_list_duplicate[5]
print(DNA_list_duplicate)

['T', 'A', 'A', 'T', 'A', '?', 'G', 'T', 'G', 'A']
['T', 'A', 'A', 'T', 'A', 'G', 'T', 'G', 'A']


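**list.pop(index)** removes the element at the given index and returns it; if no index is given, it removes the last element. **del list[index]** also removes an element, but it returns nothing and always needs an index.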
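The difference between the two can be sketched as follows, starting again from the nine-letter list; the names removed and last are introduced for the example.

```python
DNA_list = list('TAATAGTGA')

# pop(index) removes the item and hands it back
removed = DNA_list.pop(5)
print(removed)             # G

# pop() with no index removes the last item
last = DNA_list.pop()
print(last)                # A

# del removes by index but returns nothing
del DNA_list[0]
print(''.join(DNA_list))   # AATATG
```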

---
## 4. Tuple