***
Welcome! Strings are one of the most used object types in Python. 
<br>
<br>
They are sequences of characters (letters, digits, symbols and whitespace) that are declared by **enclosing the characters in "" or ''**.
<br>
They are extremely important in Natural Language Processing, where most of the work is based on text. A good command of Python strings is essential for most operations on natural language.
<br>
<br>
A computer stores strings (and individual characters) as binary numbers. Translating characters into the binary format that a computer recognizes is called encoding.
<br>
<br>
Strings in Python are immutable, which means that once a string is created, it cannot be changed. However, we can create a new string by transforming or concatenating existing ones. Python provides many built-in functions and methods for manipulating strings, including splitting, joining, searching, and replacing.
<br>
<br>
Working with strings is a fundamental skill in Python programming, and it's essential for a wide range of applications, from web development to data analysis. With a solid understanding of Python strings, you'll be able to write more efficient and effective code. Let's dive deeper into the world of strings in Python!
***

# 1 - Python Strings

## 1.1 - Declaration and Storage

Let's declare a string first:

In [1]:
our_string = "Europe"

Out of curiosity, let's see the binary representation of this word:

In [2]:
bin(int.from_bytes(our_string.encode(), 'big'))

'0b10001010111010101110010011011110111000001100101'

If we change one character, only a part of our binary representation changes:

In [3]:
bin(int.from_bytes("Europa".encode(), 'big'))

'0b10001010111010101110010011011110111000001100001'
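To see exactly which part changes, a minimal sketch can compare the two words byte by byte. It assumes UTF-8, the default of `str.encode()`; the names `europe_bits`, `europa_bits` and `changed` are only illustrative.

```python
# Encode both words to bytes (UTF-8 is the default for str.encode()).
europe_bytes = "Europe".encode()
europa_bytes = "Europa".encode()

# Format every byte as 8 bits so the two words line up byte by byte.
europe_bits = [format(byte, "08b") for byte in europe_bytes]
europa_bits = [format(byte, "08b") for byte in europa_bytes]

# Collect the positions where the two encodings differ.
changed = [
    index
    for index, (left, right) in enumerate(zip(europe_bits, europa_bits))
    if left != right
]

print(europe_bits)
print(europa_bits)
print(changed)  # only the last byte (index 5) differs
```

Each of these characters takes exactly one byte, so swapping `e` for `a` only touches the final 8 bits. Decoding reverses the process: `europa_bytes.decode()` gives back `'Europa'`.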

We can access different elements of our string:

In [4]:
# Accessing first element
our_string[0]

'E'

In [5]:
# Accessing second element
our_string[1]

'u'

In [6]:
# Accessing last element
our_string[-1]

'e'

In [7]:
# Accessing slices
our_string[1:3]

'ur'

In [8]:
# If we try to access elements outside of the range of 
# the string we get an error
our_string[10]

IndexError: string index out of range
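Unlike single-index access, slices are forgiving about bounds. The sketch below, with illustrative variable names, shows this along with the optional step value of a slice.

```python
our_string = "Europe"

# A slice stops at the end of the string instead of raising an IndexError.
tail = our_string[3:10]

# A slice that starts past the end is simply empty.
empty = our_string[10:]

# The third slice value is the step: 2 takes every other character,
# and -1 walks the string backwards.
every_other = our_string[::2]
reversed_string = our_string[::-1]

print(tail)             # ope
print(repr(empty))      # ''
print(every_other)      # Erp
print(reversed_string)  # eporuE
```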

**Strings are immutable, meaning that they cannot be changed after creating them, although they can be used to generate new objects.**

In [9]:
# Trying to replace a character in place raises a TypeError
our_string[4] = 'a'

TypeError: 'str' object does not support item assignment
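Since item assignment is not allowed, the usual workaround is to build a new string. The following is a small sketch of two common options, slicing plus concatenation and `str.replace`; the variable names are only illustrative.

```python
our_string = "Europe"

# Slicing and concatenation create a brand-new string object.
via_slicing = our_string[:5] + "a"

# str.replace also returns a new string. "Europe" contains a single
# lowercase "e", so only the last character is swapped here.
via_replace = our_string.replace("e", "a")

print(via_slicing)  # Europa
print(via_replace)  # Europa
print(our_string)   # Europe (the original is untouched)
```

Note that `str.replace` swaps every occurrence of the substring, so slicing is the safer choice when only one specific position should change.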

## 1.2 - String Combination

String combination is pretty straightforward in Python using the `+` symbol:

In [11]:
str_1 = 'Good'
str_2 = 'Morning!'

In [12]:
str_1+str_2

'GoodMorning!'

We can also combine multiple strings together:

In [13]:
str_1+' '+str_2

'Good Morning!'

Replication is also possible using `*`:

In [14]:
str_1*4

'GoodGoodGoodGood'
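Besides `+` and `*`, two other built-in options are often used to combine strings: `str.join` and f-strings. This is a short sketch rather than a recommendation of one over the other; `joined` and `formatted` are illustrative names.

```python
str_1 = "Good"
str_2 = "Morning!"

# str.join places the separator between every item of the iterable.
joined = " ".join([str_1, str_2])

# An f-string interpolates the variables directly into the literal.
formatted = f"{str_1} {str_2}"

print(joined)     # Good Morning!
print(formatted)  # Good Morning!
```

`str.join` tends to be convenient when the pieces are already in a list, for example the words of a tokenized sentence.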

## 1.3 - String Iteration

In [3]:
sentence = "This is our example sentence!"

We can iterate through each character in our string by using `for`:

In [16]:
for letter in sentence:
    print(letter)

T
h
i
s
 
i
s
 
o
u
r
 
e
x
a
m
p
l
e
 
s
e
n
t
e
n
c
e
!


And also do something based on a condition:

In [17]:
for letter in sentence:
    if letter == 'x':
        print('found an x!')

found an x!
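When the position of a match matters too, `enumerate` can supply the index alongside each character. A minimal sketch, with illustrative names `positions` and `count`:

```python
sentence = "This is our example sentence!"

# enumerate yields (index, character) pairs, so we know where each match is.
positions = [index for index, letter in enumerate(sentence) if letter == "e"]

# str.count returns the number of occurrences without an explicit loop.
count = sentence.count("e")

print(positions)  # [12, 18, 21, 24, 27]
print(count)      # 5
```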


## 1.4 - Testing

Let's see how we can test if some specific substring is in our `sentence`:

In [4]:
print(sentence)

This is our example sentence!


In [5]:
'sentence' in sentence

True

In [6]:
'banana' in sentence

False
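The `in` operator is case-sensitive and only answers yes or no. The sketch below shows a simple way to ignore case and a few related string methods; all variable names are illustrative.

```python
sentence = "This is our example sentence!"

# `in` is case-sensitive: "this" does not match "This".
case_sensitive = "this" in sentence

# Lowercasing the sentence first makes the test case-insensitive.
case_insensitive = "this" in sentence.lower()

# str.find returns the index of the first match, or -1 if there is none.
found_at = sentence.find("sentence")
missing = sentence.find("banana")

# startswith and endswith test only the edges of the string.
starts = sentence.startswith("This")
ends = sentence.endswith("!")

print(case_sensitive, case_insensitive)  # False True
print(found_at, missing)                 # 20 -1
print(starts, ends)                      # True True
```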

## 1.5 - Escaping Characters

The following will raise an error, even though the sentence is grammatically correct, because the apostrophe closes the string early:

In [20]:
example_1 = 'This is someone's'

SyntaxError: invalid syntax (<ipython-input-20-21d44f649517>, line 1)

To solve this, we can put a backslash, the `escape character`, before the apostrophe:

In [21]:
example_1 = 'This is someone\'s'

In [22]:
example_1

"This is someone's"

Or we can enclose the string in a different type of quotes:

In [1]:
example_2 = "This is someone's"

In [2]:
example_2

"This is someone's"

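Sometimes a string contains backslashes that we want to keep as literal characters instead of having Python interpret them as escape sequences.
<br>
<br>
To do this, we declare a raw string by adding the prefix `r` before the opening quote. Without it, a path like the one below fails because `\U` is read as the start of a Unicode escape: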

In [1]:
path = 'C:\Ivo\Users'

SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 6-7: truncated \UXXXXXXXX escape (530308860.py, line 1)

Prefixing the literal with `r` makes it a raw string, so the backslashes are kept exactly as written:

In [26]:
path = r'C:\Ivo\Users'
print(path)

C:\Ivo\Users
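A raw string is only a different way of writing the literal; the resulting object is an ordinary `str`. As a small sketch (variable names are illustrative), the same path can also be written by doubling each backslash:

```python
# Raw string: backslashes are kept exactly as written.
raw_path = r"C:\Ivo\Users"

# Regular string: each backslash is escaped with another backslash.
escaped_path = "C:\\Ivo\\Users"

# Both literals produce the same 12-character string.
same = raw_path == escaped_path
path_length = len(raw_path)

# Without the r prefix, "\n" is a single newline character;
# with it, the backslash and the "n" are two separate characters.
newline_length = len("\n")
raw_newline_length = len(r"\n")

print(same, path_length)                   # True 12
print(newline_length, raw_newline_length)  # 1 2
```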
