## 8 - Strings

## ***8.1 - Introduction***

Strings are amongst the most popular types in Python. We can create them simply by enclosing characters in quotes. Python treats single quotes ' ' the same as double quotes " ".

Creating strings is as simple as assigning a value to a variable. For example:

```
var1 = 'Hello World!'
var2 = "Python Programming"
```

Strings can be concatenated (glued together) with the + operator, and repeated with *. This is another way to create new strings. For example:

In [3]:
word = 'Help' + 'A'
word

'HelpA'

In [2]:
'<' + word*5 + '>'

'<HelpAHelpAHelpAHelpAHelpA>'

Two string literals next to each other are automatically concatenated. The first line above could also have been written <code>word = 'Help' 'A'</code>. This only works with two literals, not with arbitrary string expressions.

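The difference between literals and expressions can be checked with a short sketch. The built-in `compile()` is used here only to observe the syntax error without stopping the script; the name `prefix` is illustrative and not part of the original example.

```python
# Adjacent string literals are joined by the parser.
word = 'Help' 'A'
print(word)  # HelpA

# A variable next to a literal is not a pair of literals, so it fails to parse.
try:
    compile("prefix = 'Help'\nword = prefix 'A'", "<example>", "exec")
    parsed = True
except SyntaxError:
    parsed = False
print(parsed)  # False

# Use + when either side is a variable or an expression.
prefix = 'Help'
joined = prefix + 'A'
print(joined)  # HelpA
```
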
## ***8.2 - Accessing Elements and Slicing***



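Python has no separate character type; a character is simply a string of length one, and is therefore also a substring. Individual elements can be accessed with an index, where the first character has index 0. Substrings can be specified with the *slice notation*: two indices separated by a colon, where the character at the first index is included and the character at the second index is excluded.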

Some examples:

In [23]:
word = 'Help' + 'A'
word[4]

'A'

In [24]:
word[0:2]

'He'

Slice indices have useful defaults: an omitted first index defaults to zero, an omitted second index defaults to the size of the string being sliced. 

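The defaults described above can be illustrated with a minimal sketch; the variable names `first_two`, `rest`, and `whole` are chosen here for illustration only.

```python
word = 'HelpA'

first_two = word[:2]   # same as word[0:2]
rest = word[2:]        # same as word[2:len(word)]
whole = word[:]        # both indices omitted: the whole string

print(first_two)  # He
print(rest)       # lpA
print(whole)      # HelpA

# Splitting at any index and joining the halves gives back the original.
print(first_two + rest == word)  # True
```
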
Indices may be negative numbers, to start counting from the right. Continuing the previous example:

In [3]:
word[-1]

'A'

In [4]:
word[-2]  

'p'

In [5]:
word[-2:] 

'pA'

In [6]:
word[:-2]  

'Hel'

The following is an example program with strings.

In [26]:
# Program-8.1

var1 = 'Hello World!'
var2 = "Python Programming"
print("var1[0]: ", var1[0])
print("var2[1:5]: ", var2[1:5])

var1[0]:  H
var2[1:5]:  ytho


## ***8.3 - Can We Update Strings?***

Python strings are *immutable*, i.e., *cannot be changed*. Assigning to an indexed position in a string results in an error, as shown below.

In [5]:
word = 'Help' + 'A'
word[0] = 'x'

TypeError: 'str' object does not support item assignment

In [6]:
word[:1] = 'Splat'

TypeError: 'str' object does not support item assignment

However, creating a new string with the combined content is easy and efficient, as shown below.

In [9]:
'x' + word[1:]

'xelpA'

In [10]:
'Splat' + word[4]

'SplatA'

That strings are immutable does not mean that a variable holding a string can never change. A variable currently bound to the string "pqr" can be assigned a new, separate string "xyz". In other words, we can "update" an existing variable by (re)assigning it another string; the original string object itself is not modified. The new value can be derived from the previous value or be a completely different string.

The following example illustrates this.

In [11]:
str1 = "pqr"
str2 = str1
str2

'pqr'

In [12]:
str1[0]="x"

TypeError: 'str' object does not support item assignment

In [13]:
str1

'pqr'

In [15]:
str1="xyz"
str1

'xyz'

In [16]:
str2

'pqr'

In [18]:
a = str2
a

'pqr'

In [19]:
id(a)

140191142840688

In [20]:
id(str2)

140191142840688

In [21]:
id(str1)

140191238860272

As shown above, we can use the built-in function *id()* to check whether two variables refer to the same object. The *id()* function returns the identity of an object: an integer that is guaranteed to be unique and constant for that object during its lifetime. Two objects whose lifetimes do not overlap may have the same *id()* value. In CPython, the identity is the address of the object in memory.

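The `is` operator compares identities directly, so it gives the same answer as comparing *id()* values. A minimal sketch, reusing the variable names from the example above (the names `same_before` and `same_after` are added here for illustration):

```python
str1 = "pqr"
str2 = str1                      # both names refer to the same string object

same_before = str2 is str1
print(same_before)               # True
print(id(str2) == id(str1))      # True

str1 = "xyz"                     # rebinding str1 leaves the original object untouched
same_after = str2 is str1
print(str2)                      # pqr
print(same_after)                # False
```
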
## ***8.4 - Escape Characters***


The following table lists escape sequences, including non-printable characters, that can be represented with backslash notation. Escape sequences are interpreted in both single-quoted and double-quoted strings.


<table align="center">
 <tr>
    <th style="text-align:left">Backslash notation</th>
    <th style="text-align:left">Hexadecimal character</th>
    <th style="text-align:left">Description</th>
  </tr>
  <tr>
      <td style="text-align:left">\a</td>
      <td style="text-align:left">0x07</td>    
      <td style="text-align:left">Bell or alert</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\b</td>
      <td align="left" style="text-align:left">0x08</td>    
      <td align="left" style="text-align:left">Backspace</td>    
  </tr>
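    <tr>
      <td align="center" style="text-align:left">\\</td>
      <td align="left" style="text-align:left">0x5c</td>    
      <td align="left" style="text-align:left">Backslash</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\'</td>
      <td align="left" style="text-align:left">0x27</td>    
      <td align="left" style="text-align:left">Single quote</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\"</td>
      <td align="left" style="text-align:left">0x22</td>    
      <td align="left" style="text-align:left">Double quote</td>    
  </tr>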
    <tr>
      <td align="center" style="text-align:left">\x1b</td>
      <td align="left" style="text-align:left">0x1b</td>    
      <td align="left" style="text-align:left">Escape (Python has no dedicated shorthand; written in hexadecimal notation)</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\f</td>
      <td align="left" style="text-align:left">0x0c</td>    
      <td align="left" style="text-align:left">Formfeed</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\N{name}</td>
      <td align="left" style="text-align:left"></td>    
      <td align="left" style="text-align:left">Character named name in the Unicode database</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\n</td>
      <td align="left" style="text-align:left">0x0a</td>    
      <td align="left" style="text-align:left">Newline</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\nnn</td>
      <td align="left" style="text-align:left"></td>    
      <td align="left" style="text-align:left">Octal notation: up to three octal digits, where each n is in the range 0-7</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\r</td>
      <td align="left" style="text-align:left">0x0d</td>    
      <td align="left" style="text-align:left">Carriage return</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\uxxxx</td>
      <td align="left" style="text-align:left"></td>    
      <td align="left" style="text-align:left">Character with the 16-bit hexadecimal value xxxx</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\t</td>
      <td align="left" style="text-align:left">0x09</td>    
      <td align="left" style="text-align:left">Tab</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\v</td>
      <td align="left" style="text-align:left">0x0b</td>    
      <td align="left" style="text-align:left">Vertical tab</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\Uxxxxxxxx</td>
      <td align="left" style="text-align:left"></td>    
      <td align="left" style="text-align:left">Character with the 32-bit hexadecimal value xxxxxxxx</td>    
  </tr>
    <tr>
      <td align="center" style="text-align:left">\xnn</td>
      <td align="left" style="text-align:left"></td>    
      <td align="left" style="text-align:left">Hexadecimal notation: exactly two hexadecimal digits, where each n is in the range 0-9, a-f, or A-F</td>    
  </tr>
</table>

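A few of the escape sequences from the table can be tried out as follows. This is a minimal sketch; the variable name `tabbed` is illustrative.

```python
# Each escape sequence stands for a single character.
tabbed = "a\tb"
print(len(tabbed))          # 3: 'a', a tab, 'b'

# Octal and hexadecimal notation name a character by its code.
print("\x41", "\101")       # A A

# Single-quoted and double-quoted strings interpret escapes identically.
print('\n' == "\n")         # True

# repr() shows the escape sequence instead of the raw character.
print(repr("one\ntwo"))     # 'one\ntwo'
```
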
## ***8.5 - String Special Operators***

Assume variable a holds 'Hello' and variable b holds 'Python', then:

<table align="center">
 <tr>
    <th style="text-align:left">Operator</th>
    <th style="text-align:left">Description</th>
    <th style="text-align:left">Example</th>
  </tr>
  <tr>
      <td align="center" style="text-align:left">+</td>
      <td align="left" style="text-align:left">Concatenation - Adds values on either side of the operator</td>    
      <td align="left" style="text-align:left">a + b will give <b>HelloPython</b></td>    
  </tr>
  <tr>
      <td align="center" style="text-align:left">*</td>
      <td align="left" style="text-align:left">Repetition - Creates new strings, concatenating multiple copies of the same string</td>    
      <td align="left" style="text-align:left">a*2 will give <b>HelloHello</b></td>    
  </tr>
  <tr>
      <td align="center" style="text-align:left">[ ]</td>
      <td align="left" style="text-align:left">Index - Gives the character at the given index</td>    
      <td align="left" style="text-align:left">a[1] will give <b>e</b></td>    
  </tr>
  <tr>
      <td align="center" style="text-align:left">[ : ]</td>
      <td align="left" style="text-align:left">Range Slice - Gives the characters from the given range</td>    
      <td align="left" style="text-align:left">a[1:4] will give <b>ell</b> </td>    
  </tr>
  <tr>
      <td align="center" style="text-align:left">in</td>
      <td align="left" style="text-align:left">Membership - Returns True if the given character or substring occurs in the string</td>    
      <td align="left" style="text-align:left"><b>'H' in a</b> will give <b>True</b></td>    
  </tr>
  <tr>
      <td align="center" style="text-align:left">not in</td>
      <td align="left" style="text-align:left">Membership - Returns True if the given character or substring does not occur in the string</td>    
      <td align="left" style="text-align:left"><b>'M' not in a</b> will give <b>True</b></td>    
  </tr>
  <tr>
      <td align="center" style="text-align:left">r/R</td>
      <td align="left" style="text-align:left">Raw String - Suppresses the special meaning of escape characters.<br> A raw string is written exactly like a normal <br>string, except that the prefix letter <br> "r" precedes the quotation marks. The "r" can be<br> lowercase (r) or uppercase (R) and must be placed immediately<br> before the opening quote mark.</td>    
      <td align="left" style="text-align:left"><b>print(r'\n')</b> prints \n and <b>print(R'\n')</b> prints \n</td>    
  </tr>
  <tr>
      <td align="center" style="text-align:left">%</td>
      <td align="left" style="text-align:left">Format - Performs String formatting</td>    
      <td align="left" style="text-align:left">See the next section</td>    
  </tr>
</table>

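The operators in the table can be checked with a short sketch, using the same values for `a` and `b` as assumed above.

```python
a = 'Hello'
b = 'Python'

print(a + b)          # HelloPython
print(a * 2)          # HelloHello
print(a[1])           # e
print(a[1:4])         # ell
print('H' in a)       # True
print('M' not in a)   # True

# A raw string keeps the backslash and the letter as two separate characters.
print(r'\n')                    # \n
print(len(r'\n'), len('\n'))    # 2 1
```
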
## ***8.6 - String Formatting Operator %***

One of Python's most useful features is the string format operator % (which we have used before). This operator is unique to strings. The following is an example.

In [14]:
name = input("What is your name? ")
age = input("What is your age? ")
print("My name is %s and I am %d years old!" % (name, int(age)))

What is your name? Kumar
What is your age? 25
My name is Kumar and I am 25 years old!


Python 3.6 [introduced](https://www.python.org/dev/peps/pep-0498/) f-strings as an improved way to format strings. The statement above can be written with an f-string as follows:

In [15]:
print(f"My name is {name} and I am {age} years old!")

My name is Kumar and I am 25 years old!

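The two styles can be compared side by side in a sketch that runs without user input. The fixed values below stand in for what `input()` would return, and the names `percent_style` and `fstring_style` are illustrative. Note that `%d` needs a number, hence the `int(age)` conversion, while the f-string inserts the text of `age` as it is.

```python
name = "Kumar"
age = "25"   # input() returns a string, so age is text here

percent_style = "My name is %s and I am %d years old!" % (name, int(age))
fstring_style = f"My name is {name} and I am {age} years old!"

print(percent_style)
print(fstring_style)
print(percent_style == fstring_style)  # True
```
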

Here is the set of conversion symbols that can be used with %:

<table align="center">
 <tr>
    <th style="text-align:left">Format Symbol</th>
    <th style="text-align:left">Conversion</th>
 
  </tr>
  <tr>
      <td align="center" style="text-align:left">%c</td>
      <td align="left" style="text-align:left">character</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%s</td>
      <td align="left" style="text-align:left">string conversion via str() prior to formatting</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%i</td>
      <td align="left" style="text-align:left">signed decimal integer</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%d</td>
      <td align="left" style="text-align:left">signed decimal integer</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%u</td>
      <td align="left" style="text-align:left">obsolete type, identical to %d (signed decimal integer)</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%o</td>
      <td align="left" style="text-align:left">octal integer</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%x</td>
      <td align="left" style="text-align:left">hexadecimal integer (lowercase letters)</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%X</td>
      <td align="left" style="text-align:left">hexadecimal integer (UPPERcase letters)</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%e</td>
      <td align="left" style="text-align:left">exponential notation (with lowercase 'e')</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%E</td>
      <td align="left" style="text-align:left">exponential notation (with UPPERcase 'E')</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%f</td>
      <td align="left" style="text-align:left">floating point real number</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%g</td>
      <td align="left" style="text-align:left">the shorter of %f and %e</td>    
       
  </tr>
   <tr>
      <td align="center" style="text-align:left">%G</td>
      <td align="left" style="text-align:left">the shorter of %f and %E</td>    
       
  </tr>
</table>

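Several of the conversions in the table are shown below with literal values, as a minimal sketch. The values are arbitrary and chosen only to make the differences visible.

```python
print("%c" % 65)            # A  (an integer is treated as a character code)
print("%c" % "z")           # z
print("%s" % [1, 2])        # [1, 2]  (any object, converted with str())
print("%d" % 42)            # 42
print("%o" % 8)             # 10
print("%x" % 255)           # ff
print("%X" % 255)           # FF
print("%e" % 1234.5)        # 1.234500e+03
print("%E" % 1234.5)        # 1.234500E+03
print("%f" % 3.14159)       # 3.141590
print("%g" % 0.00001234)    # 1.234e-05
print("%g" % 1234.5)        # 1234.5
```
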



Other supported symbols and functionality are listed in the following table.

<table align="center">
 <tr>
    <th style="text-align:left">Symbol</th>
    <th style="text-align:left">Functionality</th>
 
  </tr>
  <tr>
      <td align="center" style="text-align:left">*</td>
      <td align="left" style="text-align:left">argument specifies width or precision</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">-</td>
      <td align="left" style="text-align:left">left justification</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">+</td>
      <td align="left" style="text-align:left">display the sign</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">&#60sp&#62</td>
      <td align="left" style="text-align:left">leave a blank space before a positive number</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">#</td>
      <td align="left" style="text-align:left">alternate form: adds a leading '0o' to octal output, or a leading '0x' or '0X' to hexadecimal output, depending on whether 'x' or 'X' was used.</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">0</td>
      <td align="left" style="text-align:left">pad from left with zeros (instead of spaces)</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">%</td>
      <td align="left" style="text-align:left">'%%' leaves you with a single literal '%'</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">(var)</td>
      <td align="left" style="text-align:left">mapping variable (dictionary arguments)</td>     
  </tr>
  <tr>
      <td align="center" style="text-align:left">m.n.</td>
      <td align="left" style="text-align:left">m is the minimum total width and n is the number of digits to display after the decimal point (if applicable)</td>     
  </tr>
</table>
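
The flags and modifiers in the table can be combined with the conversion symbols. The following sketch shows one example of each; square brackets are added around some results only to make the padding visible, and the dictionary keys `name` and `age` are illustrative.

```python
print("[%5d]" % 42)             # [   42]   minimum width 5, right-justified
print("[%-5d]" % 42)            # [42   ]   left justification
print("[%+d]" % 42)             # [+42]     always display the sign
print("[% d]" % 42)             # [ 42]     blank before a positive number
print("[%05d]" % 42)            # [00042]   pad with zeros
print("%#x %#X %#o" % (255, 255, 8))   # 0xff 0XFF 0o10
print("%d%%" % 50)              # 50%       '%%' gives a literal '%'
print("%(name)s is %(age)d" % {"name": "Kumar", "age": 25})  # Kumar is 25
print("[%8.3f]" % 3.14159)      # [   3.142]  width 8, 3 digits after the point
print("[%*d]" % (6, 42))        # [    42]    width taken from the argument
print("[%.*f]" % (2, 3.14159))  # [3.14]      precision taken from the argument
```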