## Strings: PART I
A string is an ordered collection of characters used to store and represent text-based and bytes-based information.

Strings can be used to represent just about anything that can be encoded as text or bytes.

Strings can also hold the raw bytes used for media files and network transfers, as well as both the encoded and decoded forms of the non-ASCII Unicode text used in internationalized programs.
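
As a rough illustration of the text versus bytes distinction described above, the sketch below encodes a `str` to `bytes` and decodes it back. The sample text and the choice of UTF-8 are assumptions made for the example, not something the notes prescribe.

```python
# Text is stored as str (Unicode code points); raw data is stored as bytes.
text = "spÄm"                      # str: 4 characters
encoded = text.encode("utf-8")     # bytes: 'Ä' needs 2 bytes in UTF-8
decoded = encoded.decode("utf-8")  # back to the original str

print(type(text).__name__, len(text))        # str 4
print(type(encoded).__name__, len(encoded))  # bytes 5
print(decoded == text)                       # True
```

The lengths differ because `len()` counts characters for `str` and bytes for `bytes`.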

## Common String Literals and Operations
1. `""` : Empty string
2. `"Spam's"` : Double quotes, same as single
3. `'s\np\ta\x00m'` : Escape sequence
4. `"""...Multiline..."""` : Triple quoted block strings
5. `r'\temp\spam'` : Raw strings (no escape)
6. `b'sp\xc4m'` : Byte strings
7. `S1 + S2` : Concatenate
8. `u'sp\u00c4m'` : Unicode strings
9. `S * 3` : Repeat three times
10. `S[i]` : Indexing
11. `S[i:j]` : Slicing
12. `len(S)` : Length
13. `"a {0} parrot".format(kind)` : String formatting
14. `"a %s parrot" % kind` : String formatting expression
15. `S.find('pa')` : String method
16. `S.rstrip()` : Remove whitespace
17. `S.replace('pa', 'io')` : Replacement
18. `S.split(',')` : Split on delimiter
19. `S.isdigit()` : Content test
20. `S.lower()`: Case conversion
21. `S.endswith('spam')` : End test
22. `'spam'.join(strlist)` : Delimiter join
23. `S.encode('latin-1')` : Unicode encoding
24. `B.decode('utf-8')` : Unicode decoding
25. `for x in S: print(x)` : Iteration
26. `'spam' in S` : Membership
27. `[c * 2 for c in S]` : Comprehension
28. `map(ord, S)` : Map a function over each character
29. `re.match('sp(.*)am', line)` : Pattern matching
30. `\newline` : Ignored (continuation line)
31. `\\` : Backslash (stores one `\`)
32. `\'` : Single quote (stores `'`)
33. `\"` : Double quote (stores `"`)
34. `\a`: Bell
35. `\b`: Backspace
36. `\f` : Formfeed
37. `\n` : Newline (linefeed)
38. `\r` : Carriage return
39. `\t` : Horizontal tab
40. `\v` : Vertical tab
41. `\xhh` : Character with hex value hh (exactly 2 digits)
42. `\ooo` : Character with octal value ooo (up to 3 digits)
43. `\0` : Null: binary 0 character (doesn't end string)
44. `\N{ id }` : Unicode database ID
45. `\uhhhh` : Unicode character with 16-bit hex value (exactly 4 digits)
46. `\Uhhhhhhhh` : Unicode character with 32-bit hex value (exactly 8 digits)
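
To make a few of the listed methods concrete, here is a minimal sketch that runs them on one sample string. The value of `S` and the helper names `stripped`, `parts`, and `joined` are hypothetical, chosen only for this illustration.

```python
S = "spam,ham,eggs  "          # hypothetical sample text with trailing spaces

stripped = S.rstrip()          # remove trailing whitespace
parts = stripped.split(",")    # split on the comma delimiter
joined = "-".join(parts)       # join the pieces with a new delimiter

print(S.find("ham"))                    # offset of the first match: 5
print(parts)                            # ['spam', 'ham', 'eggs']
print(joined)                           # spam-ham-eggs
print(stripped.replace("ham", "jam"))   # spam,jam,eggs
print("42".isdigit(), stripped.endswith("eggs"))  # True True
print(list(map(ord, "abc")))            # map() is lazy, so wrap it in list()
print([c * 2 for c in "abc"])           # ['aa', 'bb', 'cc']
```

None of these calls changes `S`; each one returns a new object.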


In [5]:
# myfile = open('C:\new\text.dat', 'w') # C:(newline)ew(tab)ext.dat

# myfile = open('C:\\new\\text.dat', 'w')
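
The difference between the two commented-out calls can be checked without opening any file. The sketch below compares the plain, escaped, and raw spellings of the same path; the variable names are illustrative only.

```python
plain = 'C:\new\text.dat'       # \n and \t are read as newline and tab
escaped = 'C:\\new\\text.dat'   # each doubled backslash stores one backslash
raw = r'C:\new\text.dat'        # raw string: backslashes are kept literally

print(len(plain), len(escaped), len(raw))  # 13 15 15
print(escaped == raw)                      # True
print(repr(plain))                         # shows the embedded \n and \t
```

The plain version is two characters shorter because each escape sequence collapses to a single character.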

In [11]:
long_string = """
Always look on the
bright side of the life.
"""
string = 'Always look on the\nbright side of the life.'

print(string)
print(long_string)

Always look on the
bright side of the life.

Always look on the
bright side of the life.



In [12]:
# Adjacent string literals inside parentheses are joined, so comments stay out of the string

long_string = """Hi # this is comment
I am a student # This is also a comment
"""
string = (
    "Hi\n" # this is comment
    "I am a student" # This is also a comment
)

print(long_string)
print(string)

Hi # this is comment
I am a student # This is also a comment

Hi
I am a student


## Strings in Action

### Basic Operations

In [16]:
len("Hello"), 'abc' + 'def', 'abc' * 3

(5, 'abcdef', 'abcabcabc')

In [25]:
print("-" * 20) # an easy way to print a separator line

--------------------


In [29]:
myjob = "hacker"

for c in myjob: print(c, end=" ")
# step through items and print
print("\n")
for a in 'SUCCESSFUL': print(a*2, end=" + ")

h a c k e r 

SS + UU + CC + CC + EE + SS + SS + FF + UU + LL + 

In [31]:
"easy" in "Engineering", "possible" in "impossible"

(False, True)

### Indexing and Slicing

In [46]:
S = "spam"
S[0], S[-2] # fetch the first and the second-to-last characters

# S[x] with x >= 0 counts from the left, starting at offset 0
# S[x] with x < 0 counts from the right, starting at offset -1
for i in range(len(S), 0, -1): print(S[i-1], end=" ")

m a p s 
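
A short sketch of how negative offsets relate to positive ones: `S[-i]` behaves like `S[len(S) - i]`. The list name `pairs` is made up for this example.

```python
S = "spam"

# Pair each negative-offset fetch with its positive-offset equivalent
pairs = [(S[-i], S[len(S) - i]) for i in range(1, len(S) + 1)]
print(pairs)  # [('m', 'm'), ('a', 'a'), ('p', 'p'), ('s', 's')]

# An offset outside the string raises IndexError
try:
    S[4]
except IndexError as exc:
    print("IndexError:", exc)
```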

#### Position
[0 , 1 , 2 , ........, -2, -1]

In [42]:
a = "I am a student"
a[0:4], a[5:7], a[7:15], a[:-1]

# a[x:y]: offset x is included, offset y is not
# a[:-1]: from offset 0 up to, but not including, the last character

('I am', 'a ', 'student', 'I am a studen')

In [48]:
string = "Hello"

string[1:3] # fetches items at offset 1 up to but not including 3
string[1:] # fetches items at offset 1 to the end
string[:3] # fetches items at offset 0 up to but not including 3
string[:-1] # fetches items at offset 0 up to but not including the last item
string[:] # fetches all items in the sequence

string[::2] # every second item (stride of 2): 'Hlo'

'Hlo'

In [54]:
string = 'abcdefghijklmnopqrstuvwxyz'

string[1:10:2] # from offset 1 to 10 (not including 10) with stride of 2: 'bdfhj'
string[::3] # from offset 0 to end with stride of 3 'adgjmpsvy'

'adgjmpsvy'

In [56]:
string[slice(1,10)] # same as string[1:10]
string[slice(None, None, 3)] # same as string[::3] slicing

'adgjmpsvy'
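
Two slicing behaviors not shown in the cells above may be worth a quick sketch: a negative stride walks the string backwards, and out-of-range slice bounds are clipped instead of raising an error. The result names below are hypothetical.

```python
string = "abcdefghijklmnopqrstuvwxyz"

reversed_all = string[::-1]    # negative stride: the whole string, reversed
backwards = string[5:1:-1]     # offsets 5, 4, 3, 2
clipped = string[20:100]       # the upper bound is clipped to the end

print(reversed_all)  # zyxwvutsrqponmlkjihgfedcba
print(backwards)     # fedc
print(clipped)       # uvwxyz
```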

#### String Conversion Tools

In [61]:
# "20" + 1 # TypeError: can only concatenate str (not "int") to str

int("20"), str(20) # convert to integer from string and vice versa

(20, '20')

In [63]:
print(str("spam"), repr("spam"))
# repr() also converts to a string, but returns the object as a string of code that can be rerun to recreate the object

spam 'spam'


In [65]:
str("spam"), repr("spam"), print(repr("spam"))

'spam'


('spam', "'spam'", None)
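
The difference between `str()` and `repr()` is easier to see with a string that contains an escape. This is a small sketch; it uses `ast.literal_eval` as one safe way to rerun the `repr()` text, which is a choice made for this example.

```python
import ast

s = "spam\n"
shown = repr(s)                    # quotes and the \n escape become visible

print(str(s) == s)                 # True: str() of a string is the string itself
print(shown)                       # 'spam\n'
print(len(s), len(shown))          # 5 8

rebuilt = ast.literal_eval(shown)  # evaluate the literal to recreate the object
print(rebuilt == s)                # True
```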

#### Character Code Conversions

In [71]:
ord("m") # get the character's code point (same as its ASCII value here): 109
chr(109) # get the character for a code point: 'm'

a = 'o'
s = chr(ord(a) + 1) # convert to a code point, add 1, then convert back to a character
print(s)

p
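
`ord()` and `chr()` are not limited to ASCII: in Python 3 they work on Unicode code points. The sketch below shows a couple of non-ASCII cases plus a shift applied to a whole word; the sample characters and the name `shifted` are illustrative choices.

```python
print(ord("m"), chr(109))        # 109 m
print(ord("Ä"), hex(ord("Ä")))   # 196 0xc4
print(chr(0x20AC))               # €

# Shift every character of a word by one code point
shifted = "".join(chr(ord(c) + 1) for c in "HAL")
print(shifted)                   # IBM
```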


In [80]:
int('5') # 5
ord('1') - ord('0') # 1: a digit's value is its code point offset from '0'

1

In [84]:
# Converting binary digits to integer with ord()
binary = '0011'
i = 0

while binary != "":
    i = i * 2 + (ord(binary[0]) - ord('0'))
    binary = binary[1:]
    
# itr1: i = 0 * 2 + (0 - 0) = 0
# itr2: i = 0 * 2 + (0 - 0) = 0
# itr3: i = 0 * 2 + (1 - 0) = 1
# itr4: i = 1 * 2 + (1 - 0) = 3
i

3
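
The manual loop above shows the mechanics. For comparison, the built-ins can do the same conversion in both directions; this sketch reuses the same `binary` value, and `value` is a name introduced here.

```python
binary = "0011"

value = int(binary, 2)       # parse the digits as a base-2 number
print(value)                 # 3
print(bin(value))            # 0b11
print(format(value, "04b"))  # 0011 (zero-padded to 4 binary digits)
```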

In [85]:
string = "impossible"
string.replace("im", "")

'possible'
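
Note that `replace()` returns a new string and leaves the original untouched, because strings are immutable. A minimal sketch, with `result` as a hypothetical name for the returned value:

```python
string = "impossible"
result = string.replace("im", "")

print(string)  # impossible (the original is unchanged)
print(result)  # possible

# In-place assignment to a character is not allowed
try:
    string[0] = "I"
except TypeError as exc:
    print("TypeError:", exc)
```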

In [86]:
"I have %d %s with me currently" % (5, "apple")

'I have 5 apple with me currently'

In [87]:
"I have {0} {1} with me currently".format(5, "banana")

'I have 5 banana with me currently'
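
Both formatting styles also accept names instead of positions, and `format()` supports width, alignment, and precision. The sketch below is one possible illustration; `count`, `fruit`, and the result names are assumptions made for the example.

```python
count, fruit = 5, "apple"

by_name = "I have {n} {item}s".format(n=count, item=fruit)
by_dict = "I have %(n)d %(item)ss" % {"n": count, "item": fruit}
padded = "{0:>6.2f}|{1:<8}|".format(3.14159, "pie")

print(by_name)  # I have 5 apples
print(by_dict)  # I have 5 apples
print(padded)   # '  3.14|pie     |' without the quotes
```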