<h2> STRING </h2>
Strings can represent anything that can be encoded as text or bytes. Text includes symbols and words (e.g., a name), the contents of files loaded into memory, Internet addresses, Python source code, and so on. Strings can also hold the raw bytes used for media files and network transfers, and both the encoded and decoded forms of non-ASCII Unicode text used in internationalized programs. Python's strings somewhat resemble character arrays in languages such as C. Unlike C, however, Python's strings come with a powerful set of processing tools, and Python has no distinct type for individual characters; one-character strings are used instead. Python strings are categorized as immutable sequences: the characters they contain have a left-to-right positional order, and they cannot be changed in place. Strings are the first representatives of the larger class of objects called sequences.
<br>
STRING LITERALS
<br>
There are a number of ways to write strings in Python. They are listed below:
<br>
• Single quotes: 'spa"m' 
<br>• Double quotes: "spa'm" 
<br>• Triple quotes: '''... spam ...''', """... spam ...""" 
<br>• Escape sequences: "s\tp\na\0m" 
<br>• Raw strings: r"C:\new\test.spm" 
<br>• Bytes literals in 3.X and 2.6+ : b'sp\x01am' 
<br>• Unicode literals in 2.X and 3.3+: u'eggs\u0020spam'
<br>
The single and double-quoted forms are commonly used while the rest serve specialized roles.
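
As a rough sketch of how these forms behave, the following assigns one literal of each kind and inspects the results. The variable names are illustrative only, and the values in the comments assume Python 3.3 or later.

```python
# One literal of each form listed above (variable names are illustrative).
single = 'spa"m'                 # double quote embedded without escaping
double = "spa'm"                 # single quote embedded without escaping
block = """line one
line two"""                      # triple quotes may span lines
escaped = "s\tp\na\0m"           # tab, newline, and null are one character each
raw = r"C:\new\test.spm"         # backslashes are kept literally
data = b'sp\x01am'               # bytes literal
text = u'eggs\u0020spam'         # \u0020 is a space

print(len(escaped))              # 7
print(len(raw))                  # 15
print(type(data).__name__)       # bytes
print(text)                      # eggs spam
```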
<br>
<h2> Single and Double-Quoted Strings Are the Same </h2>

In Python, single- and double-quoted strings work the same way and are interchangeable. For example:


In [4]:
>>> "happy",'happy'

('happy', 'happy')

The reason for supporting both is that it lets you embed a quote character of the other variety inside a string without escaping it with a backslash. You may embed a single-quote character in a string enclosed in double-quote characters, and vice versa:

In [5]:
>>> 'knight"s', "knight's"  

('knight"s', "knight's")

Note that the comma is important here. Without it, Python automatically concatenates adjacent string literals in any expression, although it is almost as simple to add a + operator between them to invoke concatenation explicitly:

In [7]:
>>> a = "meaning " 'of' " life"
>>> a

'meaning of life'

If we added commas between the literals, the result would be a tuple, not a string. Note also that Python echoes strings in single quotes unless they contain an embedded single quote. We can also embed quote characters by escaping them with backslashes:


In [9]:
>>> a = "today\'s" , 'today\"s'
>>> a

("today's", 'today"s')

<h2> Escape Sequences Represent Special Characters </h2>
The last example embedded a quote inside a string by preceding it with a backslash. This is representative of a general pattern in strings: backslashes are used to introduce special character codings known as 'escape sequences'. The character \ and the one or more characters following it in the string literal are replaced by a single character in the resulting string object, which has the binary value specified by the escape sequence. For example, here is a five-character string that embeds a newline and a tab:



In [11]:
>>> a = 'a\nb\tc'

How the string appears depends on how it is displayed. The interactive echo shows the special characters as escapes, but 'print' interprets them instead:

In [12]:
>>> a

'a\nb\tc'

In [13]:
>>> print(a)

a
b	c


The built-in len function shows the actual number of characters in the string irrespective of how it is coded or displayed.

In [14]:
>>> len(a)

5

Absolute binary values can also be embedded in the characters of a string. For instance, here’s a five-character string that embeds two characters with binary zero values:

In [15]:
>>> a = 'a\0b\0c' 
>>> a

'a\x00b\x00c'

In [16]:
>>> len(a)

5

In Python, a null character does not terminate a string the way a null byte does in C. Python keeps both the string's text and its length in memory, so no character terminates a string. Python displays non-printable characters as hexadecimal escapes, regardless of how they were coded. This matters when processing binary data files in Python: files opened in binary mode return the raw bytes from the external file (as bytes objects in Python 3.X, as ordinary strings in 2.X), so it's fine to process files that contain any sort of binary byte values.
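
A small sketch of the points above, assuming Python 3.X: hex and octal escapes each produce a single character, an embedded null does not end the string, and a bytes object holds raw byte values. The names `s` and `raw_bytes` are illustrative.

```python
s = 'a\0b\0c'                  # two embedded null characters
print(len(s))                  # 5: the nulls count and do not end the string
print('\x41' == 'A')           # True: hex escape for character code 65
print('\101' == 'A')           # True: octal escape for the same code

raw_bytes = b'\x00\x01\xff'    # three arbitrary byte values
print(list(raw_bytes))         # [0, 1, 255]
```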
<br>
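If Python does not recognize the character after a \ as a valid escape code, it keeps the backslash in the resulting string and displays it as \\. Recent Python 3 releases also emit a warning for such unrecognized escapes, so it is safer to double the backslash or use a raw string.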

In [19]:
>>> s = "C:\py\code"           # Keeps \ in the result string 
>>> s 

'C:\\py\\code'

In [18]:
>>> len(s)

10

<h2> Raw Strings Suppress Escapes </h2>
Sometimes the special treatment of backslashes gets in the way. For instance, suppose we try to open a file like this:
<br>
myfile = open('C:\new\text.dat', 'w')
<br>
Although the intent is to open a file called text.dat in the directory C:\new, the \n is taken as a newline character and the \t as a tab.
 If, however, the letter r (uppercase or lowercase) appears just before the opening quote of a string, it turns off the escape mechanism. As a result, Python retains the backslashes literally, exactly as typed. For example:
 <br>
myfile = open(r'C:\new\text.dat', 'w')
<br>
Alternatively, two backslashes can be used together, since the first one escapes the second:
<br>
myfile = open('C:\\new\\text.dat', 'w') 
<br>
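
To make the difference concrete, here is a minimal comparison of the three spellings of the path used above. It only builds the strings and does not open any file; the variable names are illustrative.

```python
plain = 'C:\new\text.dat'        # \n and \t are interpreted as escapes
raw = r'C:\new\text.dat'         # raw string: backslashes kept as typed
doubled = 'C:\\new\\text.dat'    # each \\ yields one backslash

print(len(plain))                # 13: newline and tab are one character each
print(len(raw))                  # 15
print(raw == doubled)            # True
print('\n' in plain, '\n' in raw)  # True False
```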
<h2> Triple Quotes Code Multiline Block Strings </h2>
Python has a triple-quoted string format referred to as 'block string', that is a syntactic convenience for coding multiline text data. It begins with three quotes, either single or double, followed by any number of lines of text and is closed with the same triple quote that opened it. Single and double quotes embedded in the string’s text may be, but do not have to be, escaped. The string does not end until Python sees three unescaped quotes of the same kind used to start the literal.

In [22]:
>>> note = """This is
Python
code."""
>>> note

'This is\nPython\ncode.'

Python collects all three lines of the triple-quoted text into a single multiline string, with an embedded newline character wherever the code has a line break. To see the text exactly as it was typed, print it instead of echoing it.


In [23]:
>>> print(note)

This is
Python
code.


In fact, triple-quoted strings will retain all the enclosed text, including any to the right of your code that you might intend as comments. To avoid this put comments above or below the quoted text. Triple-quoted strings are useful anytime you need multiline text in your program; for example, to embed multiline error messages or HTML, XML, or JSON code in your Python source code files.
<br>
Triple-quoted strings are also commonly used for documentation strings, which are string literals that are taken as comments when they appear at specific points in your file. They are also sometimes used to temporarily disable lines of code during development.
<br>
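
The sketch below illustrates two of these uses: a documentation string and a block of embedded markup. The function `greet` and the markup in `menu` are made up for the example.

```python
def greet(name):
    """Return a short greeting for name.

    A triple-quoted string placed first in a function body is kept
    as the function's documentation string.
    """
    return 'Hello, ' + name

# Multiline text embedded directly in the source (the markup is made up).
menu = """<ul>
  <li>spam</li>
</ul>"""

first_doc_line = greet.__doc__.splitlines()[0]
print(first_doc_line)        # Return a short greeting for name.
print(menu.count('\n'))      # 2: one newline per line break in the literal
```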
<h2> Strings in Action </h2>
The actions here refer to string expressions, methods, and formatting.
<br>
BASIC OPERATIONS:
<br>
Strings can be concatenated using the + operator and repeated using * operator.

In [24]:
>>> a = 'abc'
>>> len(a)

3

In [25]:
>>> a + 'def'         #concatenation of  a string

'abcdef'

In [26]:
>>> a * 4              #repetition of string 'abc'

'abcabcabcabc'

Operator overloading is at work here: the same + operator performs both addition and concatenation, and the * operator performs both multiplication and repetition. You cannot mix numbers and strings with the + operator; doing so raises an error.

In [27]:
>>> 'abc' + 9

TypeError: must be str, not int

Strings can also be iterated over in loops using for statements, which repeat an action for each character. You can also test membership for both characters and substrings with the 'in' expression operator, which is essentially a search that returns a Boolean result.

In [40]:
>>> note = 'mypythoncode'
>>> for i in note: print(i, end='')

mypythoncode

The for loop assigns a variable to successive items in a sequence and executes one or more statements for each item. The variable i here acts as a cursor that steps across the string one character at a time.

In [37]:
>>> 'm' in note

True

In [39]:
>>> 'r' in note

False
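
As a small illustration that combines both ideas, the loop below counts vowels by testing each character for membership, and the last line tests for a whole substring. The counter name `vowels` is just for this sketch.

```python
note = 'mypythoncode'

vowels = 0
for ch in note:                  # ch steps through the characters in order
    if ch in 'aeiou':            # membership test on a single character
        vowels += 1

print(vowels)                    # 3
print('python' in note)          # True: 'in' also finds substrings
print(note.find('python'))       # 2: offset where the substring starts
```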

INDEXING AND SLICING
<br>
Strings are ordered collections of characters, so we can access their components by position. In Python, characters in a string are fetched by indexing, i.e. by providing the numeric offset of the desired component in square brackets after the string. The result is the one-character string at the specified position. As in C, offsets start at 0 and end at one less than the length of the string. Unlike C, however, Python also allows negative offsets. A negative offset is added to the length of the string to derive a positive offset, so it can be thought of as counting backward from the end.

In [41]:
>>> s = 'python'           #positive and negative indexing
>>> s[0], s[-3]

('p', 'h')

In [42]:
>>> s[:3], s[2:4], s[3:]

('pyt', 'th', 'hon')

Slicing is a generalized form of indexing that returns an entire section of the string rather than a single item. The second cell above shows slicing. When the offset before the colon is omitted, it defaults to the start of the string; when the offset after the colon is omitted, it defaults to the end. Probably the best way to think of slicing is as a type of parsing (analyzing structure), especially when applied to strings: it allows us to extract an entire section (substring) in a single step. Slices can be used to extract columns of data, chop off leading and trailing text, and more. The basics of slicing are straightforward. When you index a sequence object such as a string on a pair of offsets separated by a colon, Python returns a new object containing the contiguous section identified by the offset pair. The left offset is taken to be the lower bound (inclusive), and the right is the upper bound (noninclusive). That is, Python fetches all items from the lower bound up to, but not including, the upper bound.
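
A few more slices on the same string may help fix the bounds rules. Note that out-of-range slice bounds are clipped to the string rather than raising an error.

```python
s = 'python'

print(s[1:3])        # yt     offsets 1 and 2; offset 3 is excluded
print(s[:-1])        # pytho  everything but the last character
print(s[:] == s)     # True   both bounds defaulted: the whole string
print(s[2:100])      # thon   an oversized upper bound is clipped
```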
<br>
EXTENDED SLICING: THE THIRD LIMIT AND SLICE OBJECTS
<br>
Python also supports an optional third index in slice expressions, used as a step (sometimes called a stride). The full form is X[I:J:K], which fetches the items in X from offset I through J-1, stepping by K. The value of K can be used to skip items at regular intervals or to reverse their order.

In [44]:
>>> x = '123456789'
>>> x[1:8:2]

'2468'

Here every second item between offsets 1 and 7 is fetched; that is, the slice collects the items at offsets 1, 3, 5, and 7. A negative stride collects items in reverse order. With a negative stride, the meanings of the first two bounds are also reversed:

In [52]:
>>> x[8:0:-1]

'98765432'
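
The following sketch shows a few more strides and the equivalent slice object, which is what the section heading refers to: indexing with `slice(I, J, K)` gives the same result as writing `X[I:J:K]`.

```python
x = '123456789'

print(x[::2])              # 13579      every other item, from the start
print(x[::-1])             # 987654321  the whole string reversed
print(x[5:1:-1])           # 6543       offsets 5 down to 2
print(x[slice(1, 8, 2)])   # 2468       same as x[1:8:2]
```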

<h2> String Conversion Tools </h2>
Since a number and a string cannot be added together, we use conversion tools before treating a number as a string or vice versa. The int function converts a string to a number, and str converts a number to its string representation. The repr function also converts an object to a string, but returns it as a string of code that can be rerun to recreate the object.


In [56]:
>>> int('24'), str(24)          #converting to integer, converting  to string

(24, '24')

In [59]:
>>> repr(24)                    #converting to as-code string

'24'

Operands can be converted manually in order to mix strings and numbers in an expression. Floating-point numbers can be converted to and from strings in the same way.

In [62]:
>>> s = '24'   
>>> t = 3
>>> int(s) + t         #addition

27

In [66]:
>>> s + str(t)          #concatenation

'243'

In [71]:
>>> str(234.5), float('6.7')

('234.5', 6.7)
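
For numbers, str and repr look alike; the difference shows up with strings, where repr adds the quotes needed to recreate the object as code. The sketch below also shows that int rejects text that is not an integer literal. The variable names are illustrative.

```python
text = 'spam'
as_str = str(text)                 # the text itself: 4 characters
as_code = repr(text)               # the text with quotes: 6 characters

print(len(as_str), len(as_code))   # 4 6
print(eval(as_code) == text)       # True: the as-code form recreates the object

try:
    int('4.5')                     # not an integer literal
    converted = True
except ValueError:
    converted = False

print(converted)                   # False
print(int(float('4.5')))           # 4: convert to float first, then truncate
```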

CHARACTER CODE CONVERSIONS
<br>
A single character can be converted to its underlying integer code (its ASCII value, or more generally its Unicode code point) with the built-in <b> ord </b> function. The <b> chr </b> function performs the reverse, taking an integer code and converting it to the corresponding character.

In [91]:
>>> ord('a')

97

In [93]:
>>> chr(97)

'a'

The <b> ord </b> and <b> chr </b> functions can be used to perform string-based math.

In [97]:
>>> S = '6'
>>> S = chr(ord(S) + 1) 
>>> S

'7'

In [99]:
>>> S = chr(ord(S) + 1) 
>>> S

'8'

This provides an alternative to using int to convert single-character digit strings to integers.

In [104]:
>>> int('5')

5

In [115]:
>>> ord('5') - ord('0')

5
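
As one possible application of this trick, the loop below converts a string of binary digits to an integer one character at a time. It is a sketch for illustration; in practice the built-in int(bits, 2) does the same job.

```python
bits = '1101'

value = 0
for digit in bits:
    # Shift the running total left one bit, then add the new digit's value.
    value = value * 2 + (ord(digit) - ord('0'))

print(value)                    # 13
print(value == int(bits, 2))    # True: matches the built-in conversion
```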

<b>CHANGING STRINGS </b>
<br>
Because strings are immutable, to change a string we build a new string using tools such as concatenation and slicing, and then, if desired, assign the result back to the string's original name.

In [117]:
>>> s = 'python'
>>> s = s + 'code!!'
>>> s

'pythoncode!!'

In [120]:
>>> s[:6] + 'language' + s[-1] + s[-2]

'pythonlanguage!!'

Strings can also be changed by using the string method <b> replace </b>, which likewise returns a new string:

In [123]:
>>> s = s.replace('de','des')
>>> s

'pythoncodes!!'

It’s also possible to build up new text values with string formatting expressions and methods. Both substitute objects into a string, converting them to strings according to the format specified, and produce a new string as the result.

In [126]:
>>> 'The %s is %s !' % ('sun','star')           


'The sun is star !'

In [131]:
>>> 'That {0} is a {1} !'.format('sun', 'star') 

'That sun is a star !'

<h2> String Methods </h2>
In Python, methods are specific to object types: string methods, for example, work only on string objects, whereas expressions and built-in functions may work across a range of types.
<br>
METHOD CALL SYNTAX
<br>
Methods are functions that are associated with and act upon particular objects. 
<br>
<b> Attribute Fetches </b>
<br>
An expression of the form object.attribute, which fetches the value of attribute in object.
<br>
<b> Call Expressions </b>
<br>
An expression of the form function(arguments), which invokes the code of function, passing zero or more comma-separated argument objects to it, and returns function’s result value.
<br>
Putting attribute fetches and call expressions together, we get:
<br>
object.method(arguments)
<br>
Python first fetches the method of the object and then calls it, passing in both the object and the arguments.
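
The two steps can be seen separately in a short sketch. The name `bound` is illustrative; normally the fetch and the call are written together as s.upper().

```python
s = 'spam'

bound = s.upper          # attribute fetch: looks up the method on the object
result = bound()         # call expression: runs it on s

print(result)            # SPAM
print(s.find('pa'))      # 1: fetch and call in one expression, with an argument
print(s)                 # spam: the original string is unchanged
```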

<b> CHANGING STRINGS 2 </b>
<br>
We can also use the built-in function list to change a string. Once the characters are in a list, many changes can be made to it in place, without generating a new copy for each change.


In [133]:
>>> s = 'python'
>>> l = list(s)
>>> l

['p', 'y', 't', 'h', 'o', 'n']

In [137]:
>>> l[4] = 'a'
>>> l[5] = 'l'
>>> l

['p', 'y', 't', 'h', 'a', 'l']
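
To get a string back after the changes, the string method join can glue the list items together. A minimal sketch, using an empty string as the separator:

```python
s = 'python'
chars = list(s)          # explode the string into a list of characters
chars[4] = 'a'           # lists support in-place assignment
chars[5] = 'l'

s = ''.join(chars)       # implode back into a new string
print(s)                 # pythal
```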

<b> PARSING TEXT </b>
<br>
Another common role for string methods is text parsing, that is, analyzing structure and extracting substrings.

In [149]:
>>> l = 'aaabbbccc'
>>> a = l[:3]                    #extract substring at fixed offset
>>> b = l[6:]
>>> a

'aaa'

In [150]:
>>> b

'ccc'

In [153]:
>>> l = 'aaa bbb ccc'        #fields separated by whitespace
>>> c = l.split()             #no argument: split on whitespace
>>> c

['aaa', 'bbb', 'ccc']

In [154]:
>>> l = 'aaa,bbb,ccc'        #delimiter
>>> c = l.split(',')
>>> c

['aaa', 'bbb', 'ccc']
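
A slightly fuller parsing sketch, assuming a made-up comma-separated record: strip the trailing newline, split on the delimiter, convert the numeric field, and join the fields back with a different separator.

```python
line = 'bob,hacker,40\n'             # hypothetical record, as read from a file

fields = line.rstrip().split(',')    # drop the newline, then split on commas
name, job, age = fields
age = int(age)                       # fields are strings until converted

print(fields)                        # ['bob', 'hacker', '40']
print(age + 1)                       # 41
print('|'.join(fields))              # bob|hacker|40
```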

<h2> String Formatting Expressions </h2>
String formatting allows us to perform multiple type-specific substitutions on a string in a single step. There are two forms of string formatting:
<br>
<ul>
    <li> String formatting expressions: the original technique, available since Python’s inception. This form is based upon the C language’s “printf” model and sees widespread use in much existing code.</li>
    <li> String formatting method calls: a newer technique, derived in part from a same-named tool in C#/.NET, that overlaps with string formatting expression functionality.</li>
    </ul>
    <br>
 <b> Formatting Expression </b>
    <br>
  When applied to strings, the % operator provides a simple way to format values as strings according to a format definition. The % operator provides a compact way to code multiple string substitutions all at once, instead of building and concatenating parts individually.
  <br>
  To format strings:
  <br>
  1. On the left of the % operator, provide a format string containing one or more embedded conversion targets, each of which starts with a % (e.g., %d).

2. On the right of the % operator, provide the object (or objects, embedded in a tuple) that you want Python to insert into the format string on the left in place of the conversion target (or targets).


In [157]:
>>> 'This is %s %s' %('python','code')

'This is python code'

<b> Dictionary Based Formatting expressions </b>
<br>
String formatting also allows conversion targets on the left to refer to the keys in a dictionary coded on the right and fetch the corresponding values. 

In [162]:
>>> '%(quantity)d more %(python)s' % {'quantity':1, 'python':'code'}

'1 more code'
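
Dictionary keys can be combined with the usual printf-style flags, widths, and precisions. The sketch below formats a made-up record into fixed-width columns; the keys and values are illustrative.

```python
values = {'item': 'spam', 'price': 1.5, 'count': 3}

# -8s: left-justify in 8 columns; 8.2f: 8 wide, 2 decimals; 03d: zero-pad to 3
row = '%(item)-8s|%(price)8.2f|%(count)03d' % values
print(row)               # spam    |    1.50|003
```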

<h2> String Formatting Method Calls </h2>
 Unlike formatting expressions, formatting method calls are not closely based upon the C language’s “printf” model, and are sometimes more explicit in intent, but they still rely on type codes and formatting specifications.
 <br>
 <b> Formatting Method Basics </b>
 <br>
 The formatting method is based on normal function call syntax. It uses the subject string as a template and takes any number of arguments that represent the values to be substituted according to the template.

In [163]:
>>> t = '{0},{1} and {2}'              #by position
>>> t.format('book','pen','knowledge')

'book,pen and knowledge'

In [173]:
>>> t =  '{book}, {pen} and {knowledge}'          #by keyword
>>> t.format(book = 'information', pen = 'resource', knowledge = 'aim')

'information, resource and aim'

<b> Advanced Formatting Method Syntax </b>
<br>
For the formatting method, we use a colon after the possibly empty substitution target’s identification, followed by a format specifier that can name the field size, justification, and a specific type code. A substitution target has the general form {fieldname component !conversionflag :formatspec}, which consists of four parts:
<br>
<ul>
    <li> <b> fieldname </b> is an optional number or keyword identifying an argument.</li> 
    <li> <b> component </b> is a string of zero or more “.name” or “[index]” references used to fetch attributes and indexed values of the argument, which may be omitted to use the whole argument value. </li> 
    <li> <b> conversionflag </b> starts with a ! if present, which is followed by r, s, or a to call repr, str, or ascii built-in functions on the value, respectively. </li>
    <li> <b> formatspec </b> starts with a : if present, which is followed by text that specifies how the value should be presented, including details such as field width, alignment, padding, decimal precision, and so on, and ends with an optional data type code. </li>
    </ul>
    The formatspec component is described as follows:
    <br>
    [[fill]align][sign][#][0][width][,][.precision][typecode]

    

In [174]:
>>> '{0:10} = {1:10}'.format('spam', 123.4567)      

'spam       =   123.4567'
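
A few more hedged examples of the pieces described above: explicit alignment and precision, a fill character, the thousands separator, a conversion flag, and an index component. The variable names are illustrative.

```python
aligned = '{0:<10}|{1:>10.2f}'.format('spam', 123.4567)   # align, width, precision
centered = '{0:*^9}'.format('mid')                        # fill with *, center in 9
grouped = '{:,}'.format(1234567)                          # thousands separator
as_code = '{0!r}'.format('spam')                          # !r applies repr
indexed = '{0[1]}'.format(['a', 'b'])                     # [index] component

print(aligned)     # spam      |    123.46
print(centered)    # ***mid***
print(grouped)     # 1,234,567
print(as_code)     # 'spam'
print(indexed)     # b
```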

<h2> General Type Categories </h2>
<b> Types Share Operation Sets by Categories </b>
<br> Strings are immutable sequences: they are positionally ordered, their items can be accessed by offset, and they cannot be changed in place. There are three major type categories in Python:
<br>
<ul>
    <li> <b> Numbers (integer, floating-point, decimal, fraction, others): </b>Support addition, multiplication </li> 
    <li> <b> Sequences (strings, lists, tuples): </b> Support indexing, slicing, concatenation </li>
    <li> <b> Mappings (dictionaries): </b> Support indexing by key</li>
    </ul>
<h2> Mutable Types Can Be Changed in Place </h2>
IMMUTABLES:<br>
Immutable types (numbers, strings, tuples) do not support in-place changes; operations on them create new objects instead.
<br>
MUTABLES:<br>
Mutable types (lists, dictionaries, sets) can be changed in place, without creating new objects.
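
The distinction can be checked directly. In this sketch, assigning to an offset of a string raises a TypeError, while the same assignment on a list changes the existing object, as seen through a second name that refers to it. The variable names are illustrative.

```python
s = 'spam'
try:
    s[0] = 'x'                    # strings do not support item assignment
    changed_in_place = True
except TypeError:
    changed_in_place = False

items = ['s', 'p', 'a', 'm']
alias = items                     # second reference to the same list object
items[0] = 'x'                    # in-place change: no new list is created

print(changed_in_place)           # False
print(s)                          # spam
print(alias)                      # ['x', 'p', 'a', 'm']
print(alias is items)             # True
```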

