## String behaviour in Python 3
In Python 3, a string (`str`) is an immutable sequence of __Unicode code points__. Raw binary data lives in the separate `bytes` type.

In Python 2, by contrast, a `str` is a sequence of __bytes__, and Unicode text needs the separate `unicode` type.

In [15]:
# encode() turns a str into bytes
string = "café".encode("utf-8")
print(string)
# decode() turns bytes back into a str
string_unicode = string.decode("utf-8")
print(string_unicode)

b'caf\xc3\xa9'
café
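
The difference between text and bytes also shows up in length and indexing. The following is a minimal sketch, assuming the sample word "café" (the variable names `text` and `data` are illustrative only):

```python
text = "café"
data = text.encode("utf-8")

# A str counts code points; bytes counts encoded bytes.
# "é" is one code point but two bytes in UTF-8.
print(type(text).__name__, len(text))   # str 4
print(type(data).__name__, len(data))   # bytes 5

# Indexing a str yields a one-character str; indexing bytes yields an int.
print(text[3], data[3])                 # é 195
```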


## Bytes and Bytearray

A `bytes` object is an immutable sequence of integers, each in the range 0 to 255. A `bytearray` holds the same kind of values but is mutable. Indexing either one returns an `int`.

In [45]:
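# In a str literal, "\xc3" and "\xc9" are the code points U+00C3 (Ã) and U+00C9 (É),
# not raw bytes, so UTF-8 encodes each of them as two bytes
cafe = bytes("caf\xc3\xc9", encoding="utf-8")
print(cafe)
# indexing bytes returns an int (99 is the code for "c")
print(cafe[0])
print(cafe.decode("utf-8"))
# bytearray() makes a mutable copy of the same bytes
cafe_arr = bytearray(cafe)
print(cafe_arr)
print(cafe_arr[3])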

b'caf\xc3\x83\xc3\x89'
99
cafÃÉ
bytearray(b'caf\xc3\x83\xc3\x89')
195
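
To make the immutable versus mutable distinction concrete, here is a small sketch. The names `data`, `buf`, and `bytes_error` are illustrative assumptions, not part of the cells above:

```python
data = bytes([99, 97, 102])        # b'caf'
buf = bytearray(data)              # mutable copy of the same values

try:
    data[0] = 67                   # bytes does not support item assignment
except TypeError as exc:
    bytes_error = type(exc).__name__
else:
    bytes_error = None

buf[0] = 67                        # bytearray can be changed in place
buf.extend(b"e")                   # and it can grow
print(bytes_error, buf)            # TypeError bytearray(b'Cafe')
```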


In [50]:
print(cafe.decode("utf-8"))

cafÃÉ


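### Decoding errors can be handled by:
1. Ignoring: with `errors="ignore"`, Python silently drops every byte that the codec cannot decode.
2. Replacing: with `errors="replace"`, Python substitutes the replacement character U+FFFD ("�") for each undecodable byte when decoding. When encoding, the substitute is "?".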

In [48]:
print(cafe.decode("ascii", errors="ignore"))
print(cafe.decode("ascii", errors="replace"))

caf
caf����
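
For comparison, the sketch below shows the default `errors="strict"` behaviour next to the two handlers, and the "?" substitution on the encoding side. It assumes the sample word "café"; all variable names are illustrative.

```python
raw = "café".encode("utf-8")       # b'caf\xc3\xa9'

try:
    raw.decode("ascii")            # errors="strict" is the default
except UnicodeDecodeError as exc:
    strict_result = f"{exc.encoding} cannot decode byte at position {exc.start}"

ignored = raw.decode("ascii", errors="ignore")       # undecodable bytes dropped
replaced = raw.decode("ascii", errors="replace")     # one U+FFFD per bad byte
encoded = "café".encode("ascii", errors="replace")   # "?" when encoding

print(strict_result)
print(ignored, replaced, encoded)
```

Both handlers lose information, so they suit cases such as logging or display where an approximate result is acceptable.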


## Best Practices for Text Processing
Always specify the encoding when opening a text file. If you omit it, `open()` falls back to a default that depends on the platform and locale settings, so the same program can read or write different bytes on different machines.
```
file_name = "cafe.txt"  # example path
with open(file_name, "w", encoding="utf-8") as f:
    f.write("café\n")  # written as UTF-8 regardless of the platform default
```
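
The same rule applies when reading. The sketch below writes a file and reads it back twice, once with the matching encoding and once with a mismatched one. The file name and the temporary directory are assumptions made only so the example is self-contained.

```python
import os
import tempfile

# Hypothetical file inside a temporary directory, used only for this sketch.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_name = os.path.join(tmp_dir, "cafe.txt")

    with open(file_name, "w", encoding="utf-8") as f:
        f.write("café")

    # Reading with the same encoding returns the original text.
    with open(file_name, "r", encoding="utf-8") as f:
        round_trip = f.read()

    # Reading with a different encoding raises no error here,
    # but the two UTF-8 bytes of "é" come back as two wrong characters.
    with open(file_name, "r", encoding="latin-1") as f:
        wrong = f.read()

print(round_trip, wrong)
```

A mismatched encoding often fails silently like this, which is why stating it explicitly on both the write and the read side is worthwhile.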