# String

==> One place where the Python language really shines is in the manipulation of strings. This section will cover some of Python's built-in string methods and formatting operations.

==> Such string manipulation patterns come up often in data science work, and they are one big perk of Python in this context.

==> Strings are ordered, text-based data, written by enclosing the text in single, double, or triple quotes.

### String syntax

==> You've already seen plenty of strings in examples during the previous lessons, but just to recap, strings in Python can be defined using either single or double quotations. They are functionally equivalent.

In [1]:
x = 'Pluto is a planet'
y = "Pluto is a planet"
x == y

True

==> Double quotes are convenient if your string contains a single quote character (e.g. representing an apostrophe).

==> Similarly, it's easy to create a string that contains double-quotes if you wrap it in single quotes:

In [2]:
print("Pluto's a planet!")
print('My dog is named "Pluto"')

Pluto's a planet!
My dog is named "Pluto"


==> If we try to put a single quote character inside a single-quoted string, Python gets confused:

In [3]:
'Pluto's a planet!'

SyntaxError: invalid syntax (<ipython-input-3-a43631749f52>, line 1)

==> We can fix this by "escaping" the single quote with a backslash.

In [4]:
'Pluto\'s a planet!'

"Pluto's a planet!"

In [5]:
hello = "hello\nworld"
print(hello)

hello
world
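
==> As a rough summary of the escapes used so far, the sketch below shows a few common escape sequences and checks how many characters the resulting strings actually contain. The variable names are illustrative and not part of the lesson.

```python
# Each escape sequence is typed as two characters in the source,
# but it stands for a single character in the resulting string.
apostrophe = 'What\'s up?'           # \' is a single quote
quoted = "That's \"cool\""           # \" is a double quote
backslash = "Look, a mountain: /\\"  # \\ is one backslash
two_lines = "1\n2 3"                 # \n is a newline

print(apostrophe)
print(quoted)
print(backslash)
print(two_lines)
print(len("\n"), len("\\"))  # 1 1
```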


==> In addition, Python's triple quote syntax for strings lets us include newlines literally (i.e. by just hitting 'Enter' on our keyboard, rather than using the special '\n' sequence). We've already seen this in the docstrings we use to document our functions, but we can use them anywhere we want to define a string.

In [6]:
triplequoted_hello = """hello
world"""
print(triplequoted_hello)
triplequoted_hello == hello

hello
world


True

==> The print() function automatically adds a newline character unless we specify a value for the keyword argument end other than the default value of '\n':

In [7]:
print("hello")
print("world")
print("hello", end='')
print("pluto", end='')

hello
world
hellopluto

### Strings are sequences

==> Strings can be thought of as sequences of characters. Almost everything we've seen that we can do to a list, we can also do to a string.

In [8]:
# Indexing
planet = 'Pluto'
planet[0]

'P'

In [9]:
# Slicing
planet[-3:]

'uto'

In [10]:
# How long is this string?
len(planet)

5

In [11]:
# We can even loop over them
[char+'! ' for char in planet]

['P! ', 'l! ', 'u! ', 't! ', 'o! ']

==> But a major way in which they differ from lists is that they are immutable. We can't modify them.

In [12]:
planet[0] = 'B'
# planet.append doesn't work either

TypeError: 'str' object does not support item assignment
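
==> Since a string cannot be changed in place, the usual workaround is to build a new string and, if desired, bind a name to it. A minimal sketch (the name new_planet is only for illustration):

```python
planet = 'Pluto'

# Slicing and concatenation create a brand-new string object;
# the original string is left untouched.
new_planet = 'B' + planet[1:]

print(new_planet)  # Bluto
print(planet)      # Pluto
```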

In [13]:
String0 = 'Taj Mahal is beautiful'
String1 = "Taj Mahal is beautiful"
String2 = '''Taj Mahal
is
beautiful'''

In [14]:
print(String0, type(String0))
print(String1, type(String1))
print(String2, type(String2))

Taj Mahal is beautiful <class 'str'>
Taj Mahal is beautiful <class 'str'>
Taj Mahal
is
beautiful <class 'str'>


==> String indexing and slicing work the same way as for lists, which were explained in detail earlier.

In [15]:
print(String0[4])
print(String0[4:])

M
Mahal is beautiful


### String methods

==> Like list, the type str has lots of very useful methods. I'll show just a few examples here.

In [16]:
# ALL CAPS
claim = "Pluto is a planet!"
claim.upper()

'PLUTO IS A PLANET!'

In [17]:
# all lowercase
claim.lower()

'pluto is a planet!'

==> **capitalize( )** returns a copy of the string with its first character capitalized and the rest lowercased.

In [18]:
String3 = 'observe the first letter in this sentence.'
print(String3.capitalize())

Observe the first letter in this sentence.
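
==> The example above starts from an all-lowercase sentence, so it does not show what happens to the remaining characters. The sketch below uses a made-up mixed-case string, and contrasts **capitalize( )** with the related **title( )** method:

```python
mixed = 'hELLO wORLD'

# capitalize() upper-cases the first character and lower-cases the rest.
print(mixed.capitalize())  # Hello world

# title() capitalizes the first letter of every word instead.
print(mixed.title())       # Hello World
```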


==> **center( )** is used to center align the string by specifying the field width.

In [19]:
String0.center(70)

'                        Taj Mahal is beautiful                        '

==> The padding can also be filled with any other single character, passed as the second argument.

In [20]:
String0.center(70,'-')

'------------------------Taj Mahal is beautiful------------------------'

==> **zfill( )** pads the string on the left with zeros until it reaches the specified field width.

In [21]:
String0.zfill(30)

'00000000Taj Mahal is beautiful'
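
==> Zero padding is more commonly applied to strings that hold numbers. A small sketch with made-up values, showing how **zfill( )** treats a leading sign and a string that is already wide enough:

```python
# zfill() pads on the left with zeros until the string reaches the width.
print('7'.zfill(3))      # 007

# A leading sign stays in front of the padding.
print('-42'.zfill(5))    # -0042

# If the string is already at least as long as the width, it is unchanged.
print('12345'.zfill(3))  # 12345
```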

==> **expandtabs( )** replaces each tab character '\t' with spaces. Its argument sets the tab size, that is, the spacing between tab stops, which defaults to 8 columns.

In [22]:
s = 'h\te\tl\tl\to'
print(s)
print(s.expandtabs(1))
print(s.expandtabs())

h	e	l	l	o
h e l l o
h       e       l       l       o
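
==> Note that a tab is not replaced by a fixed number of spaces: it is padded up to the next tab stop, so the number of spaces depends on the current column. A short sketch with a tab size of 4 (the strings are arbitrary examples):

```python
# With a tab size of 4, tab stops sit at columns 0, 4, 8, ...
# Each tab is replaced by just enough spaces to reach the next stop.
print(repr('ab\tc'.expandtabs(4)))    # 'ab  c'     (2 spaces)
print(repr('abc\td'.expandtabs(4)))   # 'abc d'     (1 space)
print(repr('abcd\te'.expandtabs(4)))  # 'abcd    e' (4 spaces)
```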


==> **count( )** returns the number of non-overlapping occurrences of a substring in the given string. The start and stop indices can also be specified to restrict the search, or left out. (These are optional arguments, which will be covered with functions.)

In [23]:
print(String0.count('a', 0))
print(String0.count('a', 5, 10))

4
2
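
==> **count( )** is not limited to single characters. The sketch below, using arbitrary example strings, shows it counting longer substrings and illustrates what "non-overlapping" means in practice:

```python
# count() works on substrings, not just single characters.
print('banana'.count('an'))  # 2

# Matches are non-overlapping: after one match, the search resumes
# right after it, so 'aaaa' contains 'aa' twice, not three times.
print('aaaa'.count('aa'))    # 2

# The comparison is case-sensitive.
print('Banana'.count('b'))   # 0
```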


==> **join( )** inserts the string it is called on between the elements of its argument. Here the argument is itself a string, so 'a' is placed between its characters.

In [24]:
'a'.join('*_-')

'*a_a-'

==> **join( )** can also be used to convert a list of strings into a single string.

In [25]:
a = list(String0)
print(a)
b = ''.join(a)
print(b)

['T', 'a', 'j', ' ', 'M', 'a', 'h', 'a', 'l', ' ', 'i', 's', ' ', 'b', 'e', 'a', 'u', 't', 'i', 'f', 'u', 'l']
Taj Mahal is beautiful


==> By calling join( ) on a non-empty separator, any string can be inserted between the list elements as they are combined into one string.

In [26]:
c = '/'.join(a)[18:]
print(c)

 /i/s/ /b/e/a/u/t/i/f/u/l


==> **replace( )** returns a copy of the string in which every occurrence of one substring is replaced with another.

In [27]:
String0.replace('Taj Mahal','Bengaluru')

'Bengaluru is beautiful'
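
==> Two details are worth seeing in action: **replace( )** leaves the original string untouched, and an optional third argument caps the number of replacements. A minimal sketch (the variable names city and swapped are illustrative):

```python
city = 'Taj Mahal is beautiful'

# replace() returns a new string; the original is unchanged.
swapped = city.replace('a', 'A')
print(swapped)  # TAj MAhAl is beAutiful
print(city)     # Taj Mahal is beautiful

# An optional third argument limits how many occurrences are replaced.
print(city.replace('a', 'A', 1))  # TAj Mahal is beautiful
```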

==> **strip( )** removes unwanted characters from the left and right ends of a string.

In [28]:
f = '    hello      '

==> If no characters are specified, it removes all whitespace from the left and right ends of the string.

In [29]:
f.strip()

'hello'

==> When characters are specified, **strip( )** removes those characters from both ends of the string, stopping on each side at the first character that is not among them.

In [30]:
f = '   ***----hello---*******     '

In [31]:
f.strip('*')

'   ***----hello---*******     '

==> The asterisks were not removed. This is because the outermost characters on both sides are spaces, and a space is not among the characters we asked to strip, so stripping stops immediately. The argument is treated as a set of characters, so their order does not matter; to reach the asterisks, the space must be included as well.

In [32]:
print(f.strip(' *'))
print(f.strip(' *-'))

----hello---
hello
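
==> To make the "set of characters" behaviour concrete, the sketch below checks that the order of the characters in the argument makes no difference, and that characters in the middle of the string are never removed. The second string is an arbitrary example.

```python
f = '   ***----hello---*******     '

# The argument is a set of characters, so order makes no difference.
print(f.strip(' *-') == f.strip('-* '))  # True

# Stripping stops at the first character not in the set, so matching
# characters in the middle of the string are never touched.
print('xxhexxlloxx'.strip('x'))  # hexxllo
```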


==> **lstrip( )** and **rstrip( )** work the same way as strip( ); the only difference is that **lstrip( )** strips only from the left end and **rstrip( )** only from the right.

In [33]:
print(f.lstrip(' *'))
print(f.rstrip(' *'))

----hello---*******     
   ***----hello---


==> **find( )** returns the index of the first occurrence of a substring, or -1 if the substring is not found. One can also specify the index values between which it has to search.

In [34]:
print(String0.find('is'))
print(String0.find('planet'))

10
-1


In [35]:
print(String0.find('i', 1))
print(String0.find('n', 2, 4))

10
-1


In [36]:
# Searching for the first index of a substring
claim.index('plan')

11
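
==> **find( )** and **index( )** return the same position when the substring is present, but they behave differently when it is missing. A short sketch of the difference, using a substring that does not occur in claim:

```python
claim = "Pluto is a planet!"

# find() signals a missing substring with -1 ...
print(claim.find('dwarf'))  # -1

# ... whereas index() raises a ValueError.
try:
    claim.index('dwarf')
except ValueError as err:
    print('index() raised ValueError:', err)

# When only presence matters, the `in` operator is often simplest.
print('planet' in claim)  # True
```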

In [37]:
claim.startswith(planet)

True

==> **endswith( )** checks whether the string ends with the given suffix. Optional start and end indices restrict the check to a slice of the string.

In [38]:
claim.endswith('dwarf planet')

False

In [39]:
print(String0.endswith('l', 0))
print(String0.endswith('M', 0, 5))

True
True
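
==> Both **startswith( )** and **endswith( )** also accept a tuple of candidates, which is handy for checks such as file extensions. A small sketch, where the file name is a hypothetical example:

```python
filename = 'orbit_data.csv'  # hypothetical file name, for illustration

# A tuple tests several candidate suffixes (or prefixes) at once.
is_table = filename.endswith(('.csv', '.tsv'))
print(is_table)  # True

# The checks are case-sensitive.
print(filename.startswith('Orbit'))  # False
```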


### Going between strings and lists: .split() and .join()

==> str.split() turns a string into a list of smaller strings, breaking on whitespace by default. This is super useful for taking you from one big string to a list of words.

In [40]:
words = claim.split()
words

['Pluto', 'is', 'a', 'planet!']

==> Occasionally you'll want to split on something other than whitespace:

In [41]:
datestr = '1956-01-31'
year, month, day = datestr.split('-')
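
==> Two behaviours of **split()** are easy to trip over: the pieces are always strings, and an explicit separator is handled differently from the whitespace default. A sketch with illustrative names (numbers is not used elsewhere in the lesson):

```python
datestr = '1956-01-31'

# split() always returns strings; convert explicitly if numbers are needed.
numbers = [int(part) for part in datestr.split('-')]
print(numbers)  # [1956, 1, 31]

# With no argument, runs of whitespace count as a single separator.
print('a  b   c'.split())  # ['a', 'b', 'c']

# With an explicit separator, consecutive separators yield empty strings.
print('a  b'.split(' '))   # ['a', '', 'b']
```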

==> **str.join()** takes us in the other direction, sewing a list of strings up into one long string, using the string it was called on as a separator.

In [42]:
'/'.join([month, day, year])

'01/31/1956'

In [43]:
# Yes, we can put unicode characters right in our string literals :)
' 👏 '.join([word.upper() for word in words])

'PLUTO 👏 IS 👏 A 👏 PLANET!'
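
==> One caveat: **str.join()** only accepts strings, so a list of numbers has to be converted first. A minimal sketch, with a made-up list of integers:

```python
numbers = [1956, 1, 31]

# join() only accepts strings, so passing integers raises a TypeError.
try:
    '-'.join(numbers)
except TypeError as err:
    print('TypeError:', err)

# Converting each element with str() first avoids the error.
print('-'.join(str(n) for n in numbers))  # 1956-1-31
```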

### Building strings with .format()


==> Python lets us concatenate strings with the **+** operator.

In [44]:
planet + ', we miss you.'

'Pluto, we miss you.'

==> If we want to throw in any non-string objects, we have to be careful to call **str()** on them first.

In [45]:
position = 9
planet + ", you'll always be the " + position + "th planet to me."

TypeError: can only concatenate str (not "int") to str

In [46]:
planet + ", you'll always be the " + str(position) + "th planet to me."

"Pluto, you'll always be the 9th planet to me."

==> This is getting hard to read and annoying to type. **str.format()** to the rescue.

In [47]:
"{}, you'll always be the {}th planet to me.".format(planet, position)

"Pluto, you'll always be the 9th planet to me."
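
==> **str.format()** can do more than fill in placeholders in order. The sketch below shows a few commonly used features: referring to arguments by index, and format specs after a colon. It also shows the equivalent f-string, which assumes Python 3.6 or later; the numeric values are made up for illustration.

```python
planet = 'Pluto'
position = 9

# Fields can be referred to by index, which allows reuse and reordering.
print("{0}, {0}, {1}".format(planet, position))  # Pluto, Pluto, 9

# Format specs after a colon control how a value is rendered.
print("{:.2f}".format(3.14159))  # 3.14
print("{:,}".format(1234567))    # 1,234,567
print("{:.1%}".format(0.256))    # 25.6%

# f-strings (Python 3.6+) embed the expressions directly in the literal.
print(f"{planet}, you'll always be the {position}th planet to me.")
```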

**RAJKUMAR ZALAVADIA - Mo: 7041645834   Email : rajzalavadia50@gmail.com**