# Python Strings
Reference: 

https://docs.python.org/3/library/stdtypes.html#string-methods

https://docs.python.org/library/string.html#new-string-formatting

* Strings are immutable: once created, they cannot be changed. Assigning a new value to a string variable does not alter the old string; Python creates a new string object and binds the variable to it.

* Strings in Python are surrounded by either single or double quotation marks.

* 'hello' is the same as "hello".

* You can display a string literal with the **print()** function:

In [1]:
print("Hello")
print('Hello')

Hello
Hello


---
## Define a string

In [2]:
my_string = "Analytics Vidhya creating next generation data science eco-system"

**Multiline Strings**

You can assign a multiline string to a variable by using three quotes:

In [2]:
a = """Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua."""
print(a)

Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.


In [3]:
a = '''Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.'''
print(a)

Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incididunt
ut labore et dolore magna aliqua.


**String Formatting**

In [None]:
'An integer: %i; a float: %f; another string: %s' % (1, 0.1, 'string')

In [None]:
'An integer: {}; a float: {}; another string: {}'.format(1, 0.1, 'string')

In [None]:
f'An integer: {1}; a float: {0.1}; another string: {"string"}'

Format specifications (usable with `str.format()` and f-strings):
* {0:.2f} formats the first argument as a floating-point number with two decimal places
* {1:s} formats the second argument as a string
* {2:d} formats the third argument as a decimal integer
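Format specifications such as `.2f`, `s`, and `d` can be tried directly. The following is a minimal sketch; the names `price`, `label`, and `count` and their values are made up for illustration.

```python
# Hypothetical values, chosen only to illustrate the format specs.
price = 3.14159
label = "pi"
count = 42

# Positional fields with format specs: .2f (two decimals), s (string), d (integer)
by_position = "{0:.2f} {1:s} {2:d}".format(price, label, count)

# The same specs work after the colon inside an f-string.
by_fstring = f"{price:.2f} {label:s} {count:d}"

print(by_position)  # 3.14 pi 42
print(by_fstring)   # 3.14 pi 42
```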

The `r` prefix stands for raw: backslashes in the literal are kept as-is instead of starting escape sequences.

In [5]:
s = r"this\has\no\special\characters"
s

'this\\has\\no\\special\\characters'
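To see what the raw prefix changes, it may help to compare a normal and a raw literal with the same source text. This is a small sketch; the variable names are illustrative.

```python
# In a normal literal, \n is a single newline character.
normal = "a\nb"
# In a raw literal, the backslash and the letter n stay as two characters.
raw = r"a\nb"

print(len(normal))     # 3
print(len(raw))        # 4
print(raw == "a\\nb")  # True: the raw form equals an escaped backslash
```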

**Escape Character**
- To insert characters that are illegal in a string, use an escape character.
- An escape character is a backslash \ followed by the character you want to insert.

In [10]:
txt = "We are the so-called "Vikings" from the north."

SyntaxError: invalid syntax (<ipython-input-10-56cdf4283a8e>, line 1)

In [11]:
# The escape character allows you to use double quotes when you normally would not be allowed:
txt = "We are the so-called \"Vikings\" from the north."
print(txt)

We are the so-called "Vikings" from the north.


**Various Escape Characters**

- `\'`: Single quote
- `\\`: Backslash
- `\n`: New line
- `\r`: Carriage return
- `\t`: Tab
- `\b`: Backspace
- `\f`: Form feed
- `\ooo`: Octal value
- `\xhh`: Hex value
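A few of these escape sequences in use, as a quick sketch (the sample strings are made up for illustration):

```python
tabbed = "name\tvalue"        # \t is one tab character
two_lines = "first\nsecond"   # \n starts a new line
quoted = 'It\'s fine'         # \' inserts a single quote inside single quotes
octal_a = "\101"              # octal 101 is code point 65, the letter A
hex_a = "\x41"                # hex 41 is code point 65, the letter A

print(len(tabbed))              # 10
print(two_lines.splitlines())   # ['first', 'second']
print(quoted)                   # It's fine
print(octal_a == hex_a == "A")  # True
```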

---
## Modify a String

A string is an immutable object and it is not possible to modify its contents.

In [5]:
a = "hello, world!"
a[2] = 'z' # raises TypeError

TypeError: 'str' object does not support item assignment

One may however create new strings from the original one.

In [6]:
a.replace('l', 'z', 1) # replaces only the first 'l' and returns a new string

'hezlo, world!'

In [7]:
a.replace('l', 'z') # replaces every 'l' and returns a new string

'hezzo, worzd!'
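Because `replace` returns a new string, the original stays untouched. The sketch below checks this, and also shows that rebinding a name does not affect other references to the old string; `alias` is an illustrative name.

```python
a = "hello, world!"
b = a.replace("l", "z")

print(a)  # hello, world!  (the original is unchanged)
print(b)  # hezzo, worzd!

# Rebinding the name `a` creates a new object; `alias` still refers to the old one.
alias = a
a = a + "!!"
print(alias)       # hello, world!
print(a is alias)  # False
```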

In [None]:
firstname = 'Christopher'
lastname = 'Brooks'

print(firstname + ' ' + lastname)
print(firstname*3)
print('Chris' in firstname)

In [4]:
greeting = "hello"
name = "bob"
message = greeting + " " + name
message

# Make sure you convert objects to strings before concatenating.
print('Chris' + 2) # raises TypeError
print('Chris' + str(2))

TypeError: can only concatenate str (not "int") to str
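As an alternative to calling `str()` by hand, an f-string converts the values for you. A short sketch with illustrative names:

```python
name = "Chris"
n = 2

with_str = name + str(n)     # explicit conversion before concatenating
with_fstring = f"{name}{n}"  # the f-string formats n as text automatically

print(with_str)      # Chris2
print(with_fstring)  # Chris2
```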

### 2. Concatenating Strings

To concatenate, or combine, two strings you can use the `+` operator.

In [6]:
a = "Hello"
b = "World"
c = a + b
print(c)
c = a + " " + b
print(c)

HelloWorld
Hello World


In [None]:
# Concat Strings
'A' + 'B'

Using - `join`

In [5]:
characters = ['p', 'y', 't', 'h', 'o', 'n']
word = "".join(characters)
print(word) # python

python


**Combining** a list of strings into a single one

In [4]:
sentence_list = ["my", "name", "is", "George"]
sentence_string = " ".join(sentence_list)
print(sentence_string)

my name is George
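Note that `join` only accepts strings. If the list holds other types, one option is to convert each item first, as in this sketch (the `numbers` list is an illustrative example):

```python
numbers = [1, 2, 3]

try:
    ",".join(numbers)  # the items are ints, not strings
except TypeError as exc:
    print("join failed:", exc)

# Convert each item to a string before joining.
joined = ",".join(str(n) for n in numbers)
print(joined)  # 1,2,3
```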


### 3. Capitalize first letters

Using `capitalize`: converts the first character of the string to upper case and the rest to lower case

In [6]:
my_string = "programming is awesome"
my_string.capitalize()

'Programming is awesome'

Using  - `title`

In [7]:
s = "programming is awesome"

print(s.title()) # Programming Is Awesome

Programming Is Awesome
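One caveat: `title` treats every run of letters as a word, so apostrophes produce odd results. The standard library's `string.capwords` splits on whitespace only, which may be closer to what you want. A small sketch with a made-up phrase:

```python
import string

phrase = "they're bill's friends"

titled = phrase.title()            # capitalizes after each apostrophe too
capped = string.capwords(phrase)   # capitalizes whitespace-separated words only

print(titled)  # They'Re Bill'S Friends
print(capped)  # They're Bill's Friends
```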


### 4. Convert a string, or just its first letter, to lower case / upper case

In [2]:
# upper() returns the string in upper case; lower() returns it in lower case:
a = "Hello, World!"
print(a.upper())
print(a.lower())

HELLO, WORLD!
hello, world!


In [None]:
def decapitalize(text):
    # Lower-case only the first character; the rest is left unchanged
    return text[:1].lower() + text[1:]

decapitalize('FooBar') # 'fooBar'

### 5. Removes any whitespace from the beginning or the end

In [8]:
# The strip() method removes any whitespace from the beginning or the end:
a = " Hello, World! "
print(a.strip()) # returns "Hello, World!"

Hello, World!


### 6. Replace substrings

In [4]:
# The replace() method replaces a string with another string:
a = "Hello, World!"
print(a.replace("H", "J"))

Jello, World!


In [None]:
"java is easy to learn.".replace('java', 'python')

### 7. Split function
`split` returns a list of all the words in a string, or a list split on a specific character.

In [5]:
# The split() method splits the string into substrings if it finds instances of the separator:
a = "Hello, World!"
print(a.split(",")) # returns ['Hello', ' World!']

['Hello', ' World!']


In [None]:
firstname = 'Christopher Arthur Hansen Brooks'.split(' ')[0] # [0] selects the first element of the list
lastname = 'Christopher Arthur Hansen Brooks'.split(' ')[-1] # [-1] selects the last element of the list
print(firstname)
print(lastname)

In [None]:
sentence_string = "my name is George"
words = sentence_string.split() # split() returns a new list; the original string is unchanged
print(words)
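Calling `split()` with no argument and calling `split(" ")` behave differently when the text contains repeated spaces. The sketch below uses a made-up string with extra spaces to show the difference, along with the `maxsplit` parameter.

```python
spaced = "my  name is   George"

no_arg = spaced.split()             # runs of whitespace count as one separator
one_space = spaced.split(" ")       # every single space is a separator
first_only = spaced.split(maxsplit=1)  # split at most once

print(no_arg)      # ['my', 'name', 'is', 'George']
print(one_space)   # ['my', '', 'name', 'is', '', '', 'George']
print(first_only)  # ['my', 'name is   George']
```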

### 8. Repeat string

'A'*3 will repeat A three times:  AAA

In [1]:
'a'*3

'aaa'

### 9. Reversing

In [2]:
x = 'abc'
x = x[::-1]
x

'cba'

### 10. Removing unwanted characters from the ends of your string

In [3]:
name = "  George "
name_2 = "George///"
print(name.strip()) # prints "George"
print(name_2.strip("/")) # prints "George"

George
George
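The argument to `strip`, `lstrip`, and `rstrip` is a set of characters, not a suffix or prefix. The sketch below shows how that can remove more than intended, and one way to drop an exact suffix instead. The file name is a made-up example; on Python 3.9 and later, `str.removesuffix` does the same job.

```python
filename = "config.cfg"
suffix = ".cfg"

# rstrip removes any trailing '.', 'c', 'f' or 'g', including the 'g' of "config".
stripped = filename.rstrip(suffix)
print(stripped)  # confi

# Remove the exact suffix only when it is really there.
if filename.endswith(suffix):
    base = filename[:-len(suffix)]
else:
    base = filename
print(base)  # config
```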


## Checking with String

### String Length
To get the length of a string, use the len() function.

In [6]:
a = "Hello, World!"
print(len(a))

13


 Get length of a string in bytes.

In [5]:
def byte_size(string):
    # Number of bytes in the UTF-8 encoding of the string
    return len(string.encode('utf-8'))


byte_size('😀') # 4
byte_size('Hello World') # 11

11

### Check if String Contains Substring
**1. Using `in` Operator**

To check if a certain phrase or character is present in a string, we can use the keyword **in**.

In [10]:
txt = "The best things in life are free!"
print("free" in txt)

if 'free' in txt:
    print(bool(1))
else:
    print(bool(0))
    

if 'free' in txt:
    print("Yes")
else:
    print("No")

True
True
Yes


In [1]:
fullstring = "StackAbuse"
substring = "tack"

if substring in fullstring:
    print("Found!")
else:
    print("Not found!")

Found!


Find a substring in a string using the `find` and `index` methods.

**2. Using `str.index()` Method**

The string type in Python has a method called **index** that returns the starting index of the first occurrence of a substring in a string. If the substring is not found, a ValueError exception is raised, which can be handled with a try-except-else block:

* If present, it returns the starting index.
* If not found, it raises a ValueError.

In [2]:
fullstring = "StackAbuse"
substring = "tack"

try:
    fullstring.index(substring)
except ValueError:
    print("Not found!")
else:
    print("Found!")

Found!


**3. Using `str.find()` Method**

The string type has another method called find, which is more convenient to use than index because we don't need to handle any exceptions. If find doesn't find a match, it returns -1; otherwise it returns the left-most index of the substring in the larger string.

* If present it will return the starting index.
* If not found, then it will return -1

---

In [None]:
fullstring = "StackAbuse"
substring = "tack"

if fullstring.find(substring) != -1:
    print("Found!")
else:
    print("Not found!")  

In [4]:
name = 'farhad'
index = name.find('a', 2) # the search starts at index 2, so this finds the second 'a'
index 

4
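The optional start argument of `find` can also be used in a loop to collect every occurrence. The helper below, `find_all`, is a hypothetical name introduced for this sketch; it returns non-overlapping matches only.

```python
def find_all(text, sub):
    """Return the start index of every non-overlapping occurrence of sub."""
    if not sub:
        raise ValueError("sub must not be empty")
    positions = []
    start = text.find(sub)
    while start != -1:
        positions.append(start)
        # Continue searching just past the match that was found.
        start = text.find(sub, start + len(sub))
    return positions


print(find_all("farhad", "a"))   # [1, 4]
print(find_all("banana", "an"))  # [1, 3]
print(find_all("banana", "x"))   # []
```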

**3. Using Regular Expressions (REGEX)**

Regular expressions provide a more flexible (albeit more complex) way to check strings for pattern matching. Python ships with a built-in module for regular expressions, called re. The re module contains a function called search, which treats its first argument as a pattern and which we can use to match a substring as follows:

In [3]:
from re import search

fullstring = "StackAbuse"
substring = "tack"

if search(substring, fullstring):
    print("Found!")
else:
    print("Not found!")

Found!
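Since `search` interprets its first argument as a pattern, characters such as `.` or `(` have special meaning. If the substring should be matched literally, `re.escape` can be applied first. A minimal sketch with made-up sample strings:

```python
import re

text = "Total: 3.50 (approx.)"

# Unescaped, "." matches any character, so the pattern "3.50" also matches "3x50".
loose = bool(re.search("3.50", "3x50"))
# re.escape makes every character in the substring literal.
strict = bool(re.search(re.escape("3.50"), "3x50"))
literal_hit = bool(re.search(re.escape("(approx.)"), text))

print(loose)        # True
print(strict)       # False
print(literal_hit)  # True
```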


**4. Check if NOT**

To check if a certain phrase or character is NOT present in a string, we can use the `not in` operator.

In [11]:
txt = "The best things in life are free!"
print("expensive" not in txt)

if 'expensive' not in txt:
    print(bool(1))
else:
    print(bool(0))
    

if 'expensive' not in txt:
    print("Yes")
else:
    print("No")

True
True
Yes


### Check if the string is in lower case or upper case.

In [None]:
my_string.islower()

In [None]:
my_string.isupper()

### Check if the string is numeric, alphabetic, or alphanumeric.

In [None]:
"10".isnumeric()

In [None]:
"1213as".isnumeric()

In [None]:
"python".isalpha()

In [None]:
"1212as".isalpha()

In [None]:
"1212as".isalnum()

In [8]:
string = "AnalyticsVidhya"
string.isalpha()

True

In [None]:
string = "Analytics Vidhya"
string.isalpha()
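Python has three similar checks, `isdecimal`, `isdigit`, and `isnumeric`, which differ on some Unicode characters, and none of them accepts a sign or a decimal point. The sketch below compares them on a few sample strings chosen for illustration.

```python
samples = ["10", "²", "½", "-1", "1.5"]

# Map each sample to (isdecimal, isdigit, isnumeric).
results = {s: (s.isdecimal(), s.isdigit(), s.isnumeric()) for s in samples}

for s, flags in results.items():
    print(repr(s), flags)
# '10'  (True, True, True)
# '²'   (False, True, True)    superscript two is a digit but not a decimal
# '½'   (False, False, True)   a fraction is numeric only
# '-1'  (False, False, False)  the minus sign is not a digit
# '1.5' (False, False, False)  neither is the decimal point
```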

### Count of a particular `character` or a `sub-string` in a string.

In [None]:
my_string.count('a')

In [None]:
my_string.count('A')

### Check whether the string `startswith` or `endswith` a particular substring.

In [None]:
my_string.endswith('python')

In [None]:
my_string.endswith('system')

In [None]:
my_string.startswith('python')

In [None]:
my_string.startswith('analytics')
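Both methods are case-sensitive, and both accept a tuple of candidates. The sketch below uses the same sentence as the string defined earlier, stored under the illustrative name `title`.

```python
title = "Analytics Vidhya creating next generation data science eco-system"

exact = title.startswith("analytics")            # case differs, so no match
folded = title.lower().startswith("analytics")   # compare in lower case instead
any_suffix = title.endswith(("system", "python"))  # True if any candidate matches

print(exact)       # False
print(folded)      # True
print(any_suffix)  # True
```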

### Anagrams

The function below can be used to check if two strings are anagrams. An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

In [None]:
from collections import Counter

def anagram(first, second):
    return Counter(first) == Counter(second)


anagram("abcd3", "3acdb") # True
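The `Counter` comparison is exact, so letter case and spaces count as differences. For phrases, one possible variant ignores both. `loose_anagram` is a hypothetical name used for this sketch.

```python
from collections import Counter


def loose_anagram(first, second):
    """Compare letter and digit counts, ignoring case, spaces, and punctuation."""
    def letters(text):
        return Counter(ch.lower() for ch in text if ch.isalnum())
    return letters(first) == letters(second)


print(loose_anagram("Dormitory", "Dirty room"))  # True
print(loose_anagram("Listen", "Silent"))         # True
print(loose_anagram("abc", "abd"))               # False
```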

### Checks whether a given string is a palindrome
A palindrome is a word, phrase, or sequence that reads the same backward as forward, e.g., madam or nurses run. The function below compares characters exactly, so spaces and letter case count.

In [7]:
def palindrome(a):
    return a == a[::-1]


palindrome('mom') # True

True
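A strict character-by-character comparison returns False for a phrase such as "nurses run", because the space is not mirrored. One possible variant drops non-alphanumeric characters and ignores case before comparing. `loose_palindrome` is a hypothetical name used for this sketch.

```python
def loose_palindrome(text):
    """Check for a palindrome, ignoring case, spaces, and punctuation."""
    cleaned = "".join(ch.lower() for ch in text if ch.isalnum())
    return cleaned == cleaned[::-1]


phrase = "nurses run"
print(phrase == phrase[::-1])     # False: the strict comparison sees the space
print(loose_palindrome(phrase))   # True
print(loose_palindrome("Madam"))  # True
print(loose_palindrome("python")) # False
```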

---
## Discover a String
### Strings are Arrays
Strings are sequences, like lists, so they can be indexed and sliced using the same syntax and rules. For example, the index -1 returns the last element of the string.

In Python 3, a string is an immutable sequence of Unicode code points, not an array of bytes.

Python does not have a separate character data type; a single character is simply a string with a length of 1.

Square brackets can be used to access elements of the string.

In [4]:
a = "Hello, World!"
print(a[1])

e


Since strings are sequences, we can loop through the characters in a string with a for loop.

In [5]:
for x in "banana":
    print(x)

b
a
n
a
n
a


### Slicing to select string characters

Use bracket notation to slice a string.

You can return a range of characters by using the slice syntax.

Specify the start index and the end index, separated by a colon, to return a part of the string.

In [12]:
b = "Hello, World!"
print(b[2:5])

llo


In [1]:
x = 'This is a string'
print(x[0]) #first character
print(x[0:1]) #first character, but we have explicitly set the end character
print(x[0:2]) #first two characters

T
T
Th


Access characters with negative indexing

In [1]:
b = "Hello, World!"
print(b[-5:-2])

orl


In [2]:
x[-1]
x[-4:-2] # This will return the slice starting from the 4th element from the end and stopping before the 2nd element from the end.
x[:3] # This is a slice from the beginning of the string, stopping before index 3 (the first three characters)
x[3:] # And this is a slice starting from the 4th element of the string and going all the way to the end.

's is a string'
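Slices also take an optional third value, the step, and they are forgiving about out-of-range bounds where plain indexing is not. A brief sketch with an illustrative string:

```python
letters = "abcdef"

print(letters[::2])    # ace   (every second character)
print(letters[1::2])   # bdf   (every second character, starting at index 1)
print(letters[2:100])  # cdef  (an end index past the string is clipped)
print(repr(letters[10:]))  # ''  (an out-of-range slice is just empty)

try:
    letters[10]        # plain indexing out of range raises instead
except IndexError as exc:
    print("IndexError:", exc)
```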

### Gets vowels (‘a’, ‘e’, ‘i’, ‘o’, ‘u’) found in a string.

In [6]:
def get_vowels(string):
    return [each for each in string if each in 'aeiou'] 


get_vowels('foobar') # ['o', 'o', 'a']
get_vowels('gym') # []

[]

### Finding Unique Elements in a String

In [8]:
my_string = "aavvccccddddeee"

# converting the string to a set
temp_set = set(my_string)

# joining the set back into a string (sets are unordered, so the character order may vary)
new_string = ''.join(temp_set)

print(new_string)

cadve
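Because a set has no defined order, the output above can differ between runs. If the order matters, two options are sketched below: `dict.fromkeys` keeps the order of first appearance, and `sorted` gives alphabetical order. The variable names are illustrative.

```python
my_string = "aavvccccddddeee"

# Dictionaries keep insertion order, so duplicates drop out in first-seen order.
in_first_seen_order = "".join(dict.fromkeys(my_string))

# Sorting the set gives a stable alphabetical result.
in_sorted_order = "".join(sorted(set(my_string)))

print(in_first_seen_order)  # avcde
print(in_sorted_order)      # acdev
```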
