# Strings in Python

In this lecture we will discuss strings and the _str_ type in Python. **You will learn**:

 - **What are strings in Python** 
 - **String basics**
 - **String indexing** 
 - **Slicing strings**
 - **Striding strings**
 
## What are strings in Python

Strings in Python are _**sequences of characters used to represent text information**_. A character is simply a symbol. For example, the English alphabet has 26 letters. Strings are represented using the _str_ type. Here are a few facts about strings:

 1. Strings can be delimited by single or double quotes, as long as the same kind is used at both ends.
 
 2. An empty string is simply a string with nothing between the delimiters.
 
 3. In Python a character is a string of length 1. 
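
The three facts above can be checked directly in the interpreter. This is only a minimal sketch; the variable names are chosen for illustration, and the len() function it uses is introduced later in this lecture.

```python
# Fact 1: single and double quotes produce the same kind of object.
single = 'spam'
double = "spam"
print(single == double)               # True
print(type(single))                   # <class 'str'>

# Fact 2: an empty string has nothing between the delimiters.
empty = ''
print(len(empty))                     # 0

# Fact 3: there is no separate character type; a character is a str of length 1.
char = 'a'
print(type(char) is str, len(char))   # True 1
```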

## String basics

To create a string object in Python, simply put the text you want within single or double quotes.

In [2]:
s = 'hello world!'   #this is a string with single quote
s

'hello world!'

In [3]:
s = "hello world!"   #this is a string with double quotes
s

'hello world!'

### Strings are immutable 
    
In Python, objects of type str and of the basic numeric types such as int or float are immutable; that is, once the object is created and assigned a value, that value cannot be changed. We mention this because, although we can use _**square brackets**_ to retrieve the character at a given index position in a string (_as we will see in the string indexing section_), we cannot use them to set a new character. See the example below and read the <font color="red"> **TypeError** </font>. 

In this example, we have created an object of type str with the value 'Cat' and made the object reference **animal** refer to it. Now we want to change the character 't' at index position 2 to the character 'n', but Python doesn't allow that. The error says "**'str' object does not support item assignment**", which means we cannot change the characters already assigned to that string object.


In [9]:
animal = 'Cat'
animal[2] = 'n'

TypeError: 'str' object does not support item assignment

However, we can create another string object and make **animal** refer to it. The original object with the value 'Cat' is itself still unchanged.

In [8]:
animal = 'Can'
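
To see that rebinding a name leaves the original object untouched, we can keep a second reference to it. This is a small sketch; the name `original` is introduced here only for illustration.

```python
animal = 'Cat'
original = animal        # a second reference to the same string object

animal = 'Can'           # make animal refer to a new string object

print(animal)            # Can
print(original)          # Cat, the first object was never modified
```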

### String length

Because strings are sequences, they are “sized” objects, so we can call the len() function and pass a string as an argument to find the size (length) of that string. <br> 
The len() function returns the string length, which is _the number of characters in the string_.

- len() returns zero for an empty string.

In [32]:
len('')               # an empty string is passed to len() function

0

In [33]:
len("Hello World")    # this string has 11 characters (including spaces)

11

In [34]:
st = 'I have homework' # this string has 15 characters (including spaces)
len(st)

15

**NOTE**: spaces are also characters and are counted by the len() function.

### String conversions

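The built-in function str() converts a data item such as an integer or a floating-point number to a string. In the other direction, int() and float() convert a string that holds a number to an integer or a floating-point number.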

Check out the following examples:

In [8]:
int("10")     # to convert a string '10' to an integer number 10

10

In [9]:
float("2.5")  # to convert a string '2.5' to a floating-point number 2.5

2.5

In [11]:
str(10)       # to convert an integer number 10 to string '10'

'10'

In [37]:
str()         # if nothing passed to str(), the function returns an empty string

''

In [10]:
str('Python') # if a string is passed to str(), the function returns that string unchanged

'Python'

### GENERAL RULE: 

To convert a data item from one type to another, use this syntax:


                                  datatype(item)
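
The rule can be sketched with a round trip between types. As an addition to the text above, note that a conversion only succeeds when the item makes sense for the target type; otherwise Python raises a ValueError. The variable names below are illustrative, and try/except is used only to show the failure without stopping the program.

```python
number = int("10")          # str -> int
price = float("2.5")        # str -> float
label = str(number + 5)     # int -> str

print(number, price, label)   # 10 2.5 15

# A string that does not look like a number cannot be converted.
try:
    int("ten")
except ValueError as error:
    print("conversion failed:", error)
```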


To add a new line to your string, use \n

In [13]:
print("Hello! this is the first line \n then the second line")

Hello! this is the first line 
 then the second line

Or to add some **tabs** to your string, use \t

In [43]:
print("Hello! this is the first sentence \t then the second sentence")

Hello! this is the first sentence 	 then the second sentence
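
An escape sequence such as \n or \t is typed as two keystrokes but stored as a single character. The sketch below checks that, and also shows the difference between echoing a string (which displays the escape sequence) and printing it (which applies it). The name `line` is illustrative.

```python
line = "first\nsecond"

print(len("\n"), len("\t"))   # 1 1, each escape sequence is one character
print(len(line))              # 12 = 5 + 1 + 6

print(repr(line))             # 'first\nsecond', what the interpreter echoes
print(line)                   # printed on two lines
```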


## String indexing 
 
Python uses square brackets [ ] to access an item in a sequence such as a string. The square brackets [ ] are also called the **_access operator_**. Suppose we are working in the Python Shell, interactive interpreter or IDLE. We can enter the following:
 

In [29]:
"Hello There!"[4]

'o'

In [15]:
s = 'program'
s[0]

'p'

The square brackets syntax can be used with data items of any data type that is a sequence, such as strings and lists. **This consistency of syntax is one of the reasons that Python is so beautiful**. 

**NOTE**: the number between the square brackets is called an **index**. All Python index positions start at 0 and end at the length of the sequence minus 1.

Let's see the first and the last index in the following example.

The method index() is called for the string _s_ just to show you the first and last index. Don't worry about the method right now, it will be covered in detail in the next lecture **String Operators and Methods**.


In [2]:
s = 'Hello World'
s.index('H'), s.index('d'), len(s)

(0, 10, 11)

Notice that the first index = 0, the last index = 10 and the length of the string _s_ = 11

So, the last index = length - 1 = 11 - 1 = 10

In Python, index positions can be positive or negative, as shown in the following figure. 
 - positive indexes count from the first character toward the last, starting at 0.
 - negative indexes count from the last character back toward the first, starting at -1.
 
<img src='../img/index.png' width=500 height=200>
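
The two ways of counting can be compared on the string used above. This is a minimal sketch; the try/except at the end is only there to show what happens when an index falls outside the valid range.

```python
s = 'Hello World'

print(s[0], s[-len(s)])        # H H, first character, two ways
print(s[len(s) - 1], s[-1])    # d d, last character, two ways
print(s[-2])                   # l, second character from the end

# An index outside the valid range raises an IndexError.
try:
    s[len(s)]
except IndexError as error:
    print("bad index:", error)
```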

## Slicing strings

You've already learned _string indexing_ and you know that we can use the square brackets [ ], or access operator, to access individual characters within a string. You also know that index positions in Python run from 0 up to the string length minus 1. 

In fact, the access operator can be used to extract not only one item or character, but an entire **_slice_** (subsequence) of items or characters, so in this section we will refer to the access operator as the **slice operator**.

The slice operator has 3 syntaxes:

 - seq[start]  ---> same as indexing
 - seq[start:end]
 - seq[start:end:step]

**NOTES**:


- The **seq** can be any sequence, such as string, tuple, or list. 


- The _start_, _end_, and _step_ values must all be integers. 


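- The default value of _start_ is 0 (when _step_ is positive).


- The default value of _end_ is len(seq) (when _step_ is positive).


- The default value of _step_ is 1. A _step_ value of zero isn’t allowed. When _step_ is negative, the defaults for _start_ and _end_ swap ends: the slice starts at the last item and runs back through the first.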
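
The defaults listed above can be checked by comparing a slice with omitted values against the same slice written out in full. This is a sketch for positive steps only; `text` is an illustrative name.

```python
text = "Hello, how are you?"

# Omitted values fall back to start=0, end=len(text), step=1.
print(text[:] == text[0:len(text):1])   # True
print(text[7:] == text[7:len(text)])    # True
print(text[:5] == text[0:5])            # True

# A step of zero is rejected with a ValueError.
try:
    text[::0]
except ValueError as error:
    print("invalid slice:", error)
```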

**HAVE FUN** going over each of them in detail with simple examples.

First let's declare a string:


In [2]:
st = "Hello, how are you?"
st

'Hello, how are you?'

Next, we will perform some slicing operations.

### $1^{st}$ syntax: seq[start] 

Extracts the start$^{th}$ item from the sequence. We already did that above in the string indexing section.

In [3]:
st[1]

'e'

In [4]:
st[8]

'o'

### $2^{nd}$ syntax: seq[start:end]

Extracts a slice from and **including** the start$^{th}$ item, up to and **excluding** the end$^{th}$ item.

In [5]:
# start=0 & no end, return everything from character at index 0 onwards to the end of string
st[0:]  

# try st[:], should give you the same result

'Hello, how are you?'

In [6]:
# start=0 & end=14, return characters from letter H (index 0) to letter e (index 13)
st[0:14]  

'Hello, how are'

Now if we type this

In [6]:
st[:10]  # no start & end=10, from letter H (index 0) to letter w (index 9)

'Hello, how'

It is also possible to use a _**negative index**_. In that case, Python counts from the last character back towards the first character. 

Let's try the following examples:

In [7]:
st[:-1]  #same as st[0:] but not including last character '?'

'Hello, how are you'

In [8]:
st[-1:]  #only last character

'?'
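
Because the start is included and the end is excluded, cutting a string at any position gives two pieces that fit back together exactly. The sketch below shows that, along with two more negative-index slices. The variable `n` and the use of + to join strings are assumptions made for illustration.

```python
st = "Hello, how are you?"

print(st[-4:])      # you?  (the last four characters)
print(st[-4:-1])    # you   (the same, without the final '?')

# Splitting at position n: st[:n] stops just before n, st[n:] starts at n.
n = 5
print(st[:n], '|', st[n:])        # Hello | , how are you?
print(st[:n] + st[n:] == st)      # True
```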

### $3^{rd}$ syntax seq[start:end:step]

Like the $2^{nd}$ syntax, but instead of extracting every character it extracts every step$^{th}$ character.

In [54]:
st = 'Hello World!'
st[0:14:1] # every character from index 0 up to index 13; the slice stops early at the end of the string

'Hello World!'

In [55]:
st[0:14:2] # EVERY OTHER character over the same range

'HloWrd'

Sometimes, more than one slicing operation gives the same output.

In [52]:
st = 'Hello World!'

#start at last char, move backwards and extract every other char, stopping before the char at index 2
st[-1:2:-2]

'!lo l'

In [53]:
#should give the same output as st[-1:2:-2]
st[:2:-2]

'!lo l'

## Striding strings

The step parameter used in the third syntax is also called a **stride**. The stride determines how we traverse the characters in the string: every character, every other character, every third character, and so on. **_Python defaults to a stride of 1_**, so that every character between two index numbers is retrieved. For example:

In [9]:
#stepping (striding) by 1 means extract every character from the beginning to the end

st[::1] 

'Hello World!'

In [10]:
st[::]

'Hello World!'

If we use two colons but omit the step size, it will default to 1. There is no point using the two colon syntax with a step size of 1, since that’s the default anyway. 
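
Larger strides are where the two colon syntax becomes useful. The sketch below tries a few stride values on the same string; the comments show the index positions that each slice visits.

```python
st = 'Hello World!'

print(st[::2])     # HloWrd, index positions 0, 2, 4, 6, 8, 10
print(st[1::2])    # el ol!, index positions 1, 3, 5, 7, 9, 11
print(st[::3])     # HlWl, index positions 0, 3, 6, 9
```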

In [56]:
#stepping by -1 means extract every character from end back to beginning 

st[::-1]  # very useful to reverse a string


'!dlroW olleH'
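
One common use of the reversing slice is to test whether a string reads the same in both directions. This is only a sketch; `is_palindrome` is a hypothetical helper written for this example, not a built-in function.

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards."""
    return text == text[::-1]

print(is_palindrome('level'))          # True
print(is_palindrome('Hello World!'))   # False
```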

### Wonderful!

### You have mastered lots of things about strings. In the following lecture you will learn about some string operators and methods.