## Strings
Any text can be a string  
* "a"
* "I'm a string!"
* "0.5"

In Python a string is written inside single quotes, double quotes, or three of either in a row
* 'd'
* "42"
* """Massive string"""
* '''Another massive string'''

In [1]:
"a"

'a'

In [2]:
'string'

'string'

In [3]:
"""multiline
string"""

'multiline\nstring'

In [4]:
"jvd
vfd
vfdkm"

SyntaxError: EOL while scanning string literal (<ipython-input-4-703b84314c3a>, line 1)

## Print strings
`print()` is a function that displays whatever you pass to it as text  
The thing to display goes inside the parentheses

In [5]:
'Hello, guys!'

'Hello, guys!'

In [6]:
print('Hello, guys!')

Hello, guys!


## How to print a quote inside a string?
How would you print this phrase?  
There are several types of quotes in language - '', "" and even their multiplication - """""", ''''''

In [9]:
print("There are several types of quotes in language - '', "" and even their multiplication - """""", '''''")

SyntaxError: EOF while scanning triple-quoted string literal (<ipython-input-9-aa62a2043678>, line 1)

The escape character exists for exactly this case: the backslash `\`  
A backslash is a special character that turns off the special meaning of the character right after it (quotes in this case)
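
A minimal sketch of the escaping rule described above. The variable names are made up for illustration; the point is only that `\'`, `\"` and `\\` each stand for a single character inside the string.

```python
# A backslash lets a quote appear inside a string delimited by the same quote.
single = 'It\'s escaped'
double = "She said \"hi\""

# Choosing the other kind of quote as delimiter avoids escaping altogether.
mixed = "It's not escaped"

# A literal backslash is written as two backslashes.
backslash = "a\\b"

print(single)
print(double)
print(mixed)
print(backslash)
```

Each escape sequence takes two characters to type but counts as one character in the resulting string.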

## Another function to look at strings
`repr()` returns the string representation of an object, roughly the way you would type it in code  
Again, the argument goes inside parentheses (as with every function call)

In [12]:
repr('a')

"'a'"

In [13]:
repr(3)

'3'

In [14]:
print("There are several quotes in this sentence - \'\', \"\" and a backslash itself \\")

There are several quotes in this sentence - '', "" and a backslash itself \


In [17]:
repr("There are several quotes in this sentence - \'\', \"\" and a backslash itself \\")

'\'There are several quotes in this sentence - \\\'\\\', "" and a backslash itself \\\\\''

In [16]:
"There are several types of quotes in language - '', \"\" and even their multiplication - \"\"\"\"\"\", \'\'\'\'\'\'"

'There are several types of quotes in language - \'\', "" and even their multiplication - """""", \'\'\'\'\'\''
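
To make the difference between `print()` and `repr()` concrete, here is a small sketch with a string that contains a newline and a tab. The names `text` and `shown` are only illustrative.

```python
text = 'line one\n\tline two'

# repr() returns a new string in which special characters are written as escapes.
shown = repr(text)

print(text)    # the newline and tab are actually rendered
print(shown)   # the newline and tab appear as \n and \t, with surrounding quotes
```

This is why a notebook cell that ends with a bare string shows quotes and escapes, while `print()` shows the text itself.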

# Strings deeper

Very important data type, especially for bioinformaticians
* "The gray fox jump over the lazy dog" - text
* "ATGTGTCGTGATGCGTTG" - DNA
* "qwerty@server.domain" - technical strings (addresses, identifiers)

## String representation
Strings are made of characters, and each character is represented internally as a numeric code. One of the simplest encodings is ASCII; Python 3 strings use Unicode code points, which match ASCII for the first 128 characters

* `chr()` - get character represented by this code
* `ord()` - get numeric code of character

In [2]:
chr(67)

'C'

In [3]:
ord('a')

97
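
A short sketch showing that `chr()` and `ord()` undo each other, and that the codes are not limited to ASCII. The variable names are illustrative.

```python
# ord() and chr() are inverses of each other.
code = ord('A')
letter = chr(code)

# Adding 1 to a code moves to the next character in the table.
next_letter = chr(ord('A') + 1)

# Codes go beyond the 128 ASCII values: this is the Greek small letter alpha.
alpha_code = ord('α')

print(code, letter, next_letter, alpha_code)
```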

## Creation
As you already know 
* `''`, `""`, triple variants - direct text
* `str()` - constructor to make strings from other objects

In [2]:
"My STRING"

'My STRING'

In [1]:
str([1, 2, 3])

'[1, 2, 3]'

## Operations with strings
Two arithmetic operators work on strings
* `+` - concatenate two strings
* `*` - repeat a string: concatenate it with itself n times

In [8]:
'Hello' + 'World'

'HelloWorld'

In [12]:
'Hello' + ' ' + 'World'

'Hello World'

In [13]:
'Hi!' * 3

'Hi!Hi!Hi!'

In [74]:
3 * 'Hi!'

'Hi!Hi!Hi!'
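
One detail worth trying out: `+` only joins a string with another string, so a number has to go through `str()` first. The sketch below uses made-up names and catches the error only to show which exception Python raises.

```python
sequence = 'ATG'
repeat = sequence * 3          # the same codon three times

count = 3
message = 'Repeats: ' + str(count)   # convert the number before concatenating

# Without str() the + operator refuses to mix str and int.
try:
    'Repeats: ' + count
except TypeError as error:
    failure = type(error).__name__

# * is handy for drawing a line exactly as long as some text.
underline = '-' * len(message)

print(repeat)
print(message)
print(underline)
print(failure)
```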

## String methods
Strings have a large number of useful methods. Strings are immutable iterable objects, so a method never changes the string in place; it returns a new value
* General purpose methods
    * `index(substring[, begin[, end]])` - find the index where the first occurrence of substring starts, searching from begin up to (not including) end; by default begin is 0 and end is the length of the string; raises `ValueError` if substring is not found
    * `count(substring[, begin[, end]])` - count non-overlapping occurrences of substring, searching from begin up to (not including) end; by default begin is 0 and end is the length of the string

In [67]:
'Hi everyone here!'.index('eve')

3

In [72]:
'Hi everyone here!'.index('and me')

ValueError: substring not found

In [71]:
'Hi everyone here!'.count('er')

2

In [73]:
'There is only light'.count('darkness')

0
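
The optional begin and end arguments and the non-overlapping rule are easier to see on a repetitive sequence. This is a sketch with an invented sequence; the `in` check is one possible way to avoid the `ValueError`, not the only one.

```python
dna = 'ATATATA'

first = dna.index('ATA')        # search from the start
second = dna.index('ATA', 1)    # search from index 1 onward

# 'ATA' fits at positions 0, 2 and 4, but matches may not overlap,
# so only the ones at 0 and 4 are counted.
non_overlapping = dna.count('ATA')

# Count only inside the first three characters ('ATA').
in_window = dna.count('A', 0, 3)

# index() raises ValueError for a missing substring, so test with `in` first.
position = dna.index('GG') if 'GG' in dna else -1

print(first, second, non_overlapping, in_window, position)
```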

* Changing case
    * `upper()` - make all characters UPPER CASE
    * `lower()` - MAKE ALL CHARACTERS lower case
    * `title()` - Make The First Letter Of Every Word Upper Case
    * `swapcase()` - sWAP tHE cASE oF eVERY lETTER
    * `capitalize()` - make the first character upper case and the rest lower case

In [12]:
'Make All Characters swapped'.capitalize()

'Make all characters swapped'

In [6]:
'AWESOME natural NuMbEr - 2.71828'.title()

'Awesome Natural Number - 2.71828'

In [7]:
'atgtcgtgtcgtgtcgtaatgagtctatatatatat'.upper()

'ATGTCGTGTCGTGTCGTAATGAGTCTATATATATAT'

In [8]:
'E.Mail@gmail.com'.lower()

'e.mail@gmail.com'

In [10]:
'E.Mail@gmail.com'.swapcase()

'e.mAIL@GMAIL.COM'
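
Because strings are immutable, these methods hand back a new string and leave the original alone. A small sketch with illustrative names, including a common use of `lower()` for case-insensitive comparison:

```python
original = 'atgC'
shouted = original.upper()

# The original string is untouched; upper() produced a new one.
print(original, shouted)

# Lower-casing both sides compares text while ignoring case.
typed = 'E.Mail@Gmail.com'
stored = 'e.mail@gmail.com'
same_address = typed.lower() == stored.lower()
print(same_address)
```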

* Checks that return `True` or `False`
    * `isalpha()` - whether the string is non-empty and contains only letters
    * `isdigit()` - whether the string is non-empty and contains only digits
    * `isalnum()` - whether the string is non-empty and contains only letters and digits
    * `isupper()` - whether all letters in the string are upper case (there must be at least one letter)
    * `islower()` - whether all letters in the string are lower case (there must be at least one letter)
    * `isspace()` - whether the string is non-empty and contains only whitespace characters
    * `startswith(substring)` - whether the string starts with substring
    * `endswith(substring)` - whether the string ends with substring

In [16]:
'abc'.isalpha()

True

In [17]:
''.isalpha()

False

In [18]:
'12'.isdigit()

True

In [19]:
'1'.isdigit()

True

In [33]:
'Aa'.isupper()

False

In [34]:
'AAA'.isupper()

True

In [35]:
'123A'.isupper()

True

In [32]:
' \t \n'.isspace()

True

In [1]:
'And I\'m in combat!'.startswith('A')

True

In [2]:
'Cause every hour in my head'.startswith('Cau')

True

In [3]:
'Sigh'.startswith('s')

False

In [6]:
'Is it true'.endswith('e')

True
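
A few edge cases of these checks, as a hedged sketch. The file name and header below are invented; passing a tuple to `endswith()` (or `startswith()`) is standard Python and tests several options at once.

```python
# isdigit() accepts only digit characters: no sign, no decimal point, no empty string.
candidates = ['42', '4.2', '-42', '']
whole_numbers = [text for text in candidates if text.isdigit()]

# A tuple of suffixes checks several endings in one call.
filename = 'reads.fastq'
is_sequence_file = filename.endswith(('.fasta', '.fastq'))

# The checks are case-sensitive and exact.
header = '>seq1 Homo sapiens'
is_header = header.startswith('>')

print(whole_numbers, is_sequence_file, is_header)
```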

* String transformation
    * `replace(old, new, n)` - replace non-overlapping occurrences of old with new, at most n times counting from the left; replaces every occurrence by default
    * `join(iterable)` - build a string from the elements of iterable, inserting the string between them; every element must be a `str`
    * `maketrans(original, new)` and `translate(table)` - translate characters one-to-one: `maketrans` builds a table mapping each character of original to the character at the same position in new, and `translate` applies that table

In [76]:
'reverse transcription'.replace('e', 'i')

'rivirsi transcription'

In [77]:
'reverse transcription'.replace('e', 'i', 1)

'riverse transcription'

In [47]:
# Non overlapping
'ATATATGTCG'.replace('ATA', 'TUT')

'TUTTATGTCG'

In [75]:
'The gray fox jump over the lazy dog'.replace('the', 'not')

'The gray fox jump over not lazy dog'

In [48]:
', '.join(('a', 'b', 'c', 'd'))

'a, b, c, d'

In [51]:
'*'.join({1, 2, 'c', True, '4.65'})

TypeError: sequence item 1: expected str instance, int found

In [2]:
'*'.join({str(1), str(2), 'c', str(True), '4.65'})

'1*c*True*2*4.65'

In [24]:
# Reverse transcription: RNA to complementary DNA
'AUGUGCGUGA'.translate(str.maketrans('AUGC', 'TACG'))

'TACACGCACT'
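
The three transformations can be combined on one sequence. This is a sketch with an invented sequence and names; a list is used for `join()` so that the order of the parts is predictable, unlike with a set.

```python
dna = 'ATGCGT'

# Transcription written as a plain replacement: every T becomes U.
rna = dna.replace('T', 'U')

# A translation table maps each base to its complement in one pass.
complement_table = str.maketrans('ATGC', 'TACG')
complement = dna.translate(complement_table)

# join() needs strings, so convert the numbers with str() first.
numbers = [1, 2, 3]
joined = '-'.join(str(number) for number in numbers)

print(rna, complement, joined)
```

Using `translate()` rather than chained `replace()` calls matters here: replacing A with T and then T with A would turn every base into A.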

* Useful methods for string processing
    * `strip(chars)` - remove leading and trailing characters that appear in chars; removes whitespace by default
    * `split(separator, n)` - convert the string to a list of its parts, cutting at each separator (runs of whitespace by default); n is the maximum number of cuts, unlimited by default
    * `splitlines()` - split text into lines, recognizing every kind of line ending

In [12]:
'   Inconsistency in spaces is part of originality. \n'.strip()

'Inconsistency in spaces is part of originality.'

In [9]:
'there is no hope'.strip('therp ')

'is no ho'

In [29]:
'There is no faith'.split()

['There', 'is', 'no', 'faith']

In [30]:
'There \nis no  \t faith'.split()

['There', 'is', 'no', 'faith']

In [41]:
print('I am\na\nfucking\r\ntext\n')

I am
a
fucking
text



In [43]:
# By every whitespace character
'I am\na\nfucking\r\ntext\n'.split()

['I', 'am', 'a', 'fucking', 'text']

In [46]:
# By UNIX newline character
'I am\na\nfucking\r\ntext\n'.split('\n')

['I am', 'a', 'fucking\r', 'text', '']

In [47]:
# By every newline character
'I am\na\nfucking\r\ntext\n'.splitlines()

['I am', 'a', 'fucking', 'text']
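
Putting `splitlines()`, `strip()`, `split()` and `join()` together, here is a minimal sketch that reads a FASTA-like record. The record text and all variable names are made up for illustration, and the loop assumes a single header line that contains at least one space.

```python
# A tiny FASTA-like record with mixed line endings (invented for this example).
text = '>seq1 example record\nATGC\r\nGGTA  \n\n'

name = ''
description = ''
parts = []
for line in text.splitlines():      # handles both \n and \r\n
    line = line.strip()             # drop stray whitespace around the line
    if line == '':
        continue                    # skip empty lines
    if line.startswith('>'):
        # Cut the header once, at the first space: the name, then the rest.
        name, description = line.strip('>').split(' ', 1)
    else:
        parts.append(line)

# Glue the sequence lines back together with nothing in between.
sequence = ''.join(parts)
print(name, description, sequence)
```

Limiting `split()` to one cut keeps the spaces inside the description intact.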