# Strings

Strings are sequences of characters.

In Python specifically, a string is a sequence of **Unicode characters**, which means it can store characters from virtually any language in the world.

- Creating Strings
- Accessing Strings
- Adding Chars to Strings
- Editing Strings
- Deleting Strings
- Operations on Strings
- String Functions

`Unicode` is a computing industry standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard is maintained by the Unicode Consortium.
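
As a small illustration of what Unicode support means in practice, the sketch below stores text from a few different scripts in ordinary strings and uses the built-in `ord()` and `chr()` to move between a character and its code point. The sample words and variable names are chosen only for illustration.

```python
# Strings can hold characters from any script, not only English letters.
greetings = ['hello', 'नमस्ते', 'こんにちは', 'héllo']
for word in greetings:
    print(word)

# ord() gives the Unicode code point of a character; chr() reverses it.
code_point = ord('A')
print(code_point)        # 65
print(chr(code_point))   # A
print(ord('अ'))          # 2309
```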

## Creating Strings

In [None]:
s = 'hello'
s = "hello"
# multiline strings
s = '''hello'''
s = """hello
world
!"""
# s = str('hello')
print(s)

Why multiple ways to create strings?
- Single quotes and double quotes are used to create single line strings.
    - If a string contains a single quote, it is better to use double quotes to avoid escaping the single quote.
    - If a string contains double quotes, it is better to use single quotes to avoid escaping the double quotes.
- Triple quotes are used to create multi-line strings, which can span several lines. Either three single quotes or three double quotes can be used.
- str() is a constructor that can convert other data types to strings.
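
The points above can be sketched as follows. The sample sentence and the variable names `a`, `b`, `c` and `combined` are illustrative only.

```python
# The same text written three ways; all three produce identical strings.
a = "it's a string"      # double quotes, no escaping needed
b = 'it\'s a string'     # single quotes, apostrophe escaped with a backslash
c = '''it's a string'''  # triple quotes also work on a single line
print(a == b == c)  # True

# str() converts other data types to strings.
combined = str(42) + str(3.5) + str(True)
print(combined)  # 423.5True
```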

In [None]:
"it's raining outside"

## Accessing Substrings from a String

### Indexing
- Each character in a string has a unique index associated with it.
- Indexing starts from 0 for the first character, 1 for the second character, and so on.
- Negative indexing is also supported, where -1 represents the last character, -2 represents the second last character, and so on.

In [None]:
# Positive Indexing
s = 'hello world'
print(s[0])  #h
print(s[1])  #e
print(s[2])  #l
print(s[3])  #l
print(s[4])  #o

In [None]:
# Negative Indexing
s = 'hello world'
print(s[-1])  #d
print(s[-2])  #l
print(s[-3])  #r
print(s[-4])  #o
print(s[-5])  #w

### Slicing
- Slicing is used to extract a substring from a string.
- The syntax for slicing is `string[start:end:step]`, where:
    - `start` is the index to start the slice (inclusive).
    - `end` is the index to end the slice (exclusive).
    - `step` is the step size (optional, default is 1).

**Slicing does not modify the original string; it returns a new substring.**

In [None]:
s = "python is awesome"
print(s[0:6])  #python
print(s[7:9])  #is
print(s[10:17])  #awesome

- If `start` is omitted, it defaults to 0.
- If `end` is omitted, it defaults to the length of the string.
- If `step` is omitted, it defaults to 1.

In [None]:
print(s[:6])  #python
print(s[7:])  #is awesome
print(s[:])  #python is awesome

In [None]:
print(s[::2])  #pto saeoe
print(s[::3])  #'ph  em' (two spaces in the middle)
print(s[::4])  #posee

- Negative step values can be used to reverse the string.

In [None]:
print(s[::-1])  #emosewa si nohtyp (reversed string) ==> important
print(s[::-2])  #eoeas otp
print(s[::-3])  #esaioy

In [None]:
print(s[-1:-6:-1])  #emose
print(s[-6:-1])  #wesom
print(s[-6:])  #wesome (s[-7:] would give awesome)

In [None]:
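print(s[6:0:-1])  #' nohty' (starts at the space at index 6; index 0 is excluded)
print(s[10:6:-1])  #'a si' (index 6 is excluded)
print(s[17:10:-1])  #emosew (start 17 is clamped to 16; index 10 is excluded)
print(s[5:15:2])  #ni ws
print(s[15:5:-2])  #msw i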

- If `start` is greater than or equal to `end` (after negative indices are converted to positions) and `step` is positive, an empty string is returned, so ensure that start < end for a positive step.
- If `start` is less than or equal to `end` (after negative indices are converted to positions) and `step` is negative, an empty string is returned, so ensure that start > end for a negative step.
- If `step` is 0, a `ValueError` is raised.

In [None]:
print("Empty Strings below")
print(s[5:15:-2])  #empty string
print(s[15:5:2])  #empty string
# print(s[5:15:0])  #ValueError

- If `start` or `end` are out of bounds, they are adjusted to fit within the string length.

In [None]:
print(s[20:25])  #empty string
print(s[-25:5])  #pytho
print(s[5:25])  #n is awesome
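
The forgiving behaviour above applies to slices only. As a minimal sketch (the variable names `out_of_range_slice` and `index_failed` are illustrative), indexing a single out-of-range position raises an `IndexError` instead of being adjusted:

```python
s = "python is awesome"

# Slicing clamps out-of-range bounds, so this is simply an empty string.
out_of_range_slice = s[20:25]
print(repr(out_of_range_slice))  # ''

# Indexing a single position past the end raises IndexError instead.
try:
    s[20]
    index_failed = False
except IndexError as err:
    index_failed = True
    print('IndexError:', err)
```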

## Editing and Deleting in Strings

In [None]:
# Editing
s = 'hello world'
s[0] = 'H'  #TypeError: 'str' object does not support item assignment
print(s)

# Python strings are immutable

Strings are immutable, which means that once a string is created it cannot be modified. Any operation that seems to modify a string actually creates a new string.
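
Since item assignment is not allowed, the usual workaround is to build a new string. A minimal sketch, with illustrative variable names:

```python
s = 'hello world'

# Build new strings instead of assigning to s[0].
capitalized = 'H' + s[1:]
replaced = s.replace('world', 'there')

print(capitalized)  # Hello world
print(replaced)     # hello there
print(s)            # hello world (the original is unchanged)
```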

In [None]:
# Deleting
s = 'hello world'
del s
print(s)  #NameError: name 's' is not defined

In [None]:
# Deleting part of a string with slicing is also not possible, since strings are immutable
s = 'hello world'
del s[-1:-5:2]  #TypeError, because str does not support item deletion
print(s)
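
To get the effect of removing part of a string, one option is to join the slices that should be kept into a new string. This is only a sketch and the variable names are illustrative:

```python
s = 'hello world'

# Keep everything except the last four characters.
trimmed = s[:-4]

# Drop the character at index 4 by joining the slices around it.
without_index_4 = s[:4] + s[5:]

print(trimmed)          # hello w
print(without_index_4)  # hell world
```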

## Operations on Strings

- `Arithmetic Operations :` It includes concatenation (+) and repetition (*) on strings.
- `Relational Operations :` It includes comparison operators like ==, !=, <, >, <=, >= to compare strings lexicographically.
- `Logical Operations :` It includes logical operators like and, or, not to combine multiple conditions involving strings.
- `Loops on Strings :` It includes iterating over each character in a string using loops like for and while.
- `Membership Operations :` It includes checking if a substring exists within a string using in and not in operators.

#### Arithmetic Operations

In [None]:
print('Rajasthan' + ' ' + 'Haryana')

In [None]:
print('Haryana '*5)

In [None]:
print("*"*50)
# Commonly used to create separators or borders in console output

#### Relational Operations

In [None]:
'Haryana' != 'Haryana'

In [None]:
'Haryana' > 'Rajasthan'
# False: 'H' comes before 'R', so 'Haryana' is lexicographically smaller

In [None]:
'Haryana' > 'haryana'
# False: 'Haryana' is lexicographically smaller because the ASCII value of 'H' is 72 and that of 'h' is 104
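
The comparisons above can be checked with the built-in `ord()`, which returns the code point of a character. A short sketch (the names `first_cmp` and `second_cmp` are illustrative):

```python
# Comparison walks both strings left to right and compares code points.
print(ord('H'), ord('h'), ord('R'))  # 72 104 82

first_cmp = 'Haryana' > 'Rajasthan'   # 'H' (72) vs 'R' (82)
second_cmp = 'Haryana' > 'haryana'    # 'H' (72) vs 'h' (104)
print(first_cmp, second_cmp)  # False False

# A case-insensitive comparison can lowercase both sides first.
print('Haryana'.lower() == 'haryana'.lower())  # True
```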

In [None]:
'Haryana' >= 'Haryana'

#### Logical Operations

In Python, logical operations can be performed on strings using the `and`, `or`, and `not` operators. These operators evaluate the truthiness of the strings involved in the operation.

In Python, non-empty strings are considered truthy (`True` in a boolean context) and empty strings are considered falsy (`False`).

In [None]:
'Haryana' and 'Rajasthan'
# Both strings are non-empty (truthy). `and` needs both operands to be truthy, so it evaluates both and returns the last one: 'Rajasthan'.

In [None]:
'Haryana' and ''
# The first string is truthy but the second is empty (falsy). `and` returns the first falsy operand it meets, so the result is ''.

In [None]:
'' and 'Rajasthan'

In [None]:
'Haryana' or 'Rajasthan'
# The first string is non-empty (truthy). `or` needs only one truthy operand, so it stops at the first one and returns it: 'Haryana'.

In [None]:
'Rajasthan' or 'Haryana'

In [None]:
'' or 'Haryana'
# The first string is empty (falsy), so `or` moves on to the second operand and returns it: 'Haryana'.

In [None]:
'Haryana' or ''
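
A common practical use of this behaviour is supplying a default for an empty string. The sketch below assumes a hypothetical helper `display_name` and a placeholder default of `'Guest'`, neither of which comes from the notebook:

```python
def display_name(raw_name):
    # `or` returns the first truthy operand, so an empty string falls back.
    return raw_name or 'Guest'

print(display_name('Ritesh'))  # Ritesh
print(display_name(''))        # Guest

# bool() shows the truth value that `and`, `or` and `not` rely on.
print(bool('hello'), bool(''), bool(' '))  # True False True
```

Note that a string containing only a space is still non-empty, so it counts as truthy.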

In [None]:
not 'hello'

In [None]:
not ""

#### Loops on Strings

In [None]:
for ch in "Haryana":
    print(ch)

In [None]:
for ch in "Harayana":
    print("Rajasthan")
# prints Rajasthan 8 times because there are 8 characters in the string "Harayana"

#### Membership Operations

In [None]:
'n' in 'Haryana'

In [None]:
'n' not in 'Haryana'
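
Membership tests also work for substrings longer than one character, and they are case-sensitive. A brief sketch with illustrative variable names:

```python
state = 'Haryana'

has_substring = 'yan' in state            # substrings work, not only single characters
has_lower_h = 'h' in state                # the check is case-sensitive
has_h_ignoring_case = 'h' in state.lower()  # normalise the case first
has_empty = '' in state                   # the empty string is found in every string

print(has_substring, has_lower_h, has_h_ignoring_case, has_empty)
# True False True True
```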

## Common Functions
These built-in functions work with strings as well as tuples, lists, sets, dictionaries, etc.

- `len() :` Returns the length of the string.
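- `min() :` Returns the smallest character in the string based on its Unicode code point (the same as the ASCII value for ASCII characters).
- `max() :` Returns the largest character in the string based on its Unicode code point (the same as the ASCII value for ASCII characters).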
- `sorted() :` Returns a sorted list of characters in the string.

In [None]:
len('hello world')

In [None]:
max('hello world')

In [None]:
min('hello world')
# Here space has the smallest ASCII value of 32

In [None]:
sorted('hello world',reverse=True)  # By default, it sorts in ascending order but here we have used reverse=True to sort in descending order
# returns a list of characters sorted in descending order not a string .
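
Because `sorted()` returns a list, getting a string back needs one more step. A minimal sketch using `str.join` with an empty separator (variable names are illustrative):

```python
text = 'hello world'

chars_desc = sorted(text, reverse=True)  # a list of single characters
as_string = ''.join(chars_desc)          # glue the characters back together

print(repr(as_string))  # 'wroolllhed '
print(len(text), max(text), repr(min(text)))  # 11 w ' '
```

The space sorts last in descending order because it has the smallest code point.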

## Capitalize/Title/Upper/Lower/Swapcase

`capitalize()` : It converts the first character to uppercase and the rest to lowercase.

In [None]:
s = 'hello world'
print(s.capitalize())   # it returns a new string and does not modify the original string
print(s)

**Here the original string is not modified because strings are immutable.**

`title()` : It converts the first character of each word to uppercase and the remaining characters to lowercase.

In [None]:
s.title()
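
One caveat worth a quick sketch: `title()` treats any run of letters as a word, so a letter that follows an apostrophe is also capitalized. The sample strings and variable names here are illustrative.

```python
with_apostrophe = "they're here".title()
mixed_case = 'hELLO wORLD'.title()

print(with_apostrophe)  # They'Re Here
print(mixed_case)       # Hello World
```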

`upper()` : It converts all characters to uppercase.

In [None]:
s.upper()

`lower()` : It converts all characters to lowercase.

In [None]:
'Hello World'.lower()

`swapcase()` : It converts uppercase characters to lowercase and vice versa.

In [None]:
'HeLlO WorLD'.swapcase()

## Count/Find/Index

`count(substring)` : It returns the number of occurrences of a substring in the string.

In [None]:
'my name is ritesh swami'.count('i')

`find(substring)` : It returns the lowest index of the substring if found in the string. If not found, it returns -1.

syntax :

- `find(substring, start, end)` : `start` and `end` are optional parameters that specify the range to search.
- `rfind(substring)` : It returns the highest index of the substring if found in the string. If not found, it returns -1.

In [None]:
'my name is ritesh swami'.find('x')

`index(substring)` : It returns the lowest index of the substring if found in the string. If not found, it raises a `ValueError`.

In [None]:
'my name is ritesh swami'.index('x')  #ValueError
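
The difference between the two methods matters mostly when the substring may be missing. The sketch below (variable names are illustrative) shows `find()` with a start position, `rfind()`, and `index()` wrapped in `try`/`except`:

```python
text = 'my name is ritesh swami'

first_i = text.find('i')              # lowest index
last_i = text.rfind('i')              # highest index
next_i = text.find('i', first_i + 1)  # search again after the first match
print(first_i, last_i, next_i)  # 8 22 12

# index() raises ValueError for a missing substring, so guard it.
try:
    position = text.index('x')
except ValueError:
    position = -1
print(position)  # -1
```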

## endswith/startswith

`endswith(suffix)` : It returns `True` if the string ends with the specified suffix, otherwise it returns `False`.

In [None]:
'my name is ritesh swami'.endswith('ami')

`startswith(prefix)` : It returns `True` if the string starts with the specified prefix, otherwise it returns `False`.

In [None]:
'my name is ritesh swami'.startswith('my n')
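
Both methods also accept a tuple of candidates, which is handy for checking several suffixes or prefixes at once. A small sketch with a made-up filename:

```python
filename = 'report.csv'  # illustrative value

is_table = filename.endswith(('.csv', '.tsv'))      # any suffix in the tuple may match
is_data_or_log = filename.startswith(('data', 'log'))
is_report = 'Report.csv'.startswith('report')       # the comparison is case-sensitive

print(is_table, is_data_or_log, is_report)  # True False False
```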

## format
It is used to format strings by embedding variables or expressions within a string.

syntax:
```python
'string with placeholders {}'.format(values)
```
The order of the values should match the order of the placeholders. Alternatively, an index inside {} (starting from 0) selects which value to use.

In [None]:
name = 'Ritesh'
state = 'Haryana'
print('My name is {} and I am from {}'.format(name, state))

In [None]:
name = 'Ritesh'
state = 'Haryana'
'Hi my name is {1} and I am from {0}'.format(state, name)  # {0} is state, {1} is name
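
Besides positional indexes, `format()` also accepts named placeholders and format specifications. The following is a sketch; the keyword names `n` and `s` and the sample sentence are illustrative.

```python
name = 'Ritesh'
state = 'Haryana'

# Named placeholders make the order of arguments irrelevant.
by_name = 'My name is {n} and I am from {s}'.format(s=state, n=name)
print(by_name)  # My name is Ritesh and I am from Haryana

# An index can be reused more than once.
by_index = '{0} lives in {1}. {0} likes Python.'.format(name, state)
print(by_index)  # Ritesh lives in Haryana. Ritesh likes Python.

# A format specification after the colon controls the presentation.
rounded = '{:.2f}'.format(3.14159)
print(rounded)  # 3.14
```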

## isalnum/ isalpha/ isdigit/ isidentifier

`isalnum()` : It returns `True` if all characters in the string are alphanumeric (letters and numbers) and there is at least one character, otherwise it returns `False`.

In [None]:
'Ritesh123'.isalnum()
# True if all characters are alphanumeric (letters and numbers) and there is at least one character, otherwise False .

In [None]:
'Ritesh'.isalnum()

Here all characters are alphabets so it returns True because alphabets are also considered as alphanumeric characters.

In [None]:
'ritesh@123'.isalnum()
# False because of special character '@'

`isalpha()` : It returns `True` if all characters in the string are alphabets and there is at least one character, otherwise it returns `False`.

In [None]:
'nitish'.isalpha()

In [None]:
'Ritesh123'.isalpha()

`isdigit()` : It returns `True` if all characters in the string are digits and there is at least one character, otherwise it returns `False`.

In [None]:
'123abc'.isdigit()

In [None]:
'123456'.isdigit()
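
Because `isdigit()` checks every character, signs and decimal points make it return `False`. The sketch below shows this and a hypothetical helper, `is_integer_text`, that allows one leading sign; the helper is an illustration and not part of Python.

```python
print('-5'.isdigit())    # False
print('3.14'.isdigit())  # False
print(''.isdigit())      # False

def is_integer_text(text):
    # Allow one leading sign, then require digits only.
    body = text[1:] if text[:1] in ('+', '-') else text
    return body.isdigit()

print(is_integer_text('-5'))    # True
print(is_integer_text('3.14'))  # False
```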

`isidentifier()` : It returns `True` if the string is a valid identifier (variable name) in Python, otherwise it returns `False`. A valid identifier must start with a letter (a-z, A-Z) or an underscore (_) and can be followed by letters, digits (0-9), or underscores.

In [None]:
'first-name'.isidentifier()

In [None]:
'_first_name'.isidentifier()
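
Note that `isidentifier()` also returns `True` for reserved words such as `class`, which cannot actually be used as variable names. One way to cover that case is the standard-library `keyword` module; the helper name `is_usable_name` below is hypothetical.

```python
import keyword

def is_usable_name(name):
    # A usable variable name is a valid identifier that is not a keyword.
    return name.isidentifier() and not keyword.iskeyword(name)

print('class'.isidentifier())         # True
print(is_usable_name('class'))        # False
print(is_usable_name('_first_name'))  # True
print(is_usable_name('2nd_name'))     # False
```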

## Split/Join

`split(delimiter)` : It splits the string into a list of substrings based on the specified delimiter. If no delimiter is provided, it splits on whitespace by default.

In [None]:
'my name is ritesh swami'.split()

In [None]:
'my,name,is,ritesh,swami'.split(',')

In [None]:
'my name is ritesh swami'.split('is')
# Here it splits the string at each occurrence of the substring 'is'
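
Two details of `split()` are easy to miss: the optional second argument limits the number of splits, and splitting on an explicit space behaves differently from the default whitespace split. A sketch with illustrative sample strings:

```python
line = 'name=ritesh=swami'

# Split only at the first '=' so the value keeps its own '=' sign.
key, value = line.split('=', 1)
print(key, value)  # name ritesh=swami

# With no argument, runs of whitespace are treated as one separator.
default_parts = 'a  b'.split()
# With an explicit ' ', every single space is a separator.
explicit_parts = 'a  b'.split(' ')
print(default_parts)   # ['a', 'b']
print(explicit_parts)  # ['a', '', 'b']
```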

`join(iterable)` : It joins the elements of an iterable (like a list or tuple) into a single string, with the specified string as the separator.
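
One thing to keep in mind is that `join()` expects every element of the iterable to be a string. A minimal sketch, with an illustrative list of numbers:

```python
numbers = [1, 2, 3]

# Convert each element to a string before joining.
joined = ', '.join(str(n) for n in numbers)
print(joined)  # 1, 2, 3

# Passing the integers directly raises TypeError.
try:
    ', '.join(numbers)
    join_failed = False
except TypeError:
    join_failed = True
print(join_failed)  # True
```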

In [None]:
" ".join(['hi', 'my', 'name', 'is', 'ritesh'])

In [None]:
"$".join(['hi', 'my', 'name', 'is', 'ritesh'])

## Replace
It replaces all occurrences of a specified substring with another substring.

syntax:
```python
string.replace(old, new, count)
```
- `old` : The substring to be replaced.
- `new` : The substring to replace with.
- `count` : (Optional) The maximum number of occurrences to replace. If not provided, all occurrences are replaced.

In [None]:
'hi my name is ritesh'.replace('ritesh', 'RITESH SWAMI')
# Here it replaces all occurrences of the substring 'ritesh' with 'RITESH SWAMI'

In [None]:
'hi my name is ritesh ritesh'.replace('ritesh', 'RITESH SWAMI', 1)

Always remember that the original string is not changed, because strings are immutable. `replace()` returns a new string with the replacements made.

## Strip
It removes leading and trailing whitespace (spaces, tabs, newlines) from the string. It can also remove specified characters from both ends of the string.

- Real world use cases:
    - Cleaning user input by removing extra spaces in form fields, especially in names, addresses, etc.
    - Preprocessing text data for analysis or machine learning.
    - Formatting strings for display or storage.
    - Preparing strings for comparison or searching.

syntax:
```python
string.strip(chars)
```
- `chars` : (Optional) A string specifying the set of characters to be removed. If not provided, it removes whitespace by default.

In [None]:
'   hello world   '.strip()

In [None]:
'***hello world***'.strip('*')
# Here it removes all leading and trailing '*' characters from the string
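
Two related points, sketched with illustrative strings: `lstrip()` and `rstrip()` trim only one side, and the `chars` argument is treated as a set of characters rather than a fixed prefix or suffix.

```python
padded = '   hello world   '

left_trimmed = padded.lstrip()   # removes leading whitespace only
right_trimmed = padded.rstrip()  # removes trailing whitespace only
print(repr(left_trimmed))   # 'hello world   '
print(repr(right_trimmed))  # '   hello world'

# Any mix of 'x' and 'y' is removed from both ends.
mixed = 'xyxhello worldyx'.strip('xy')
print(mixed)  # hello world

# Whitespace inside the string is never touched.
inner = '  hello   world  '.strip()
print(repr(inner))  # 'hello   world'
```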

## Questions

Q :  Find the length of a given string without using the len() function

In [None]:
s = input('enter the string')

counter = 0

for i in s:
  counter += 1

print('length of string is',counter)

Q : Extract the username from a given email. For example, if the email is riteshswami123@gmail.com, the username should be riteshswami123.

In [None]:
s = input('enter the email')

pos = s.index('@')
print(s[0:pos])

Q : Count the frequency of a particular character in a provided string. ex 'hello how are you' is the string, the frequency of h in this string is 2

In [None]:
s = input('enter the string : ')
term = input('what would you like to search for : ')

counter = 0
for i in s:
  if i == term:
    counter += 1

print('frequency',counter)

Q : Write a program which can remove a particular character from a string.

In [None]:
# Since strings are immutable, we cannot remove a character from a string directly, but we can create a new string without that character.
s = input('enter the string')
term = input('what would you like to remove')

result = ''

for i in s:
  if i != term:
    result = result + i

print(result)

Q : Check whether each character in a given string is a vowel or not.

In [None]:
name = "Ritesh Swami"
vowel = "AaEeIiOoUu"

for i in name :
    if i in vowel :
        print("{} is a vowel ".format(i))
    else :
        print("{} is not a vowel ".format(i))

Q : Check whether a given string is a palindrome or not. A palindrome is a string which reads the same forwards and backwards. ex: abba, malayalam

In [None]:
s = input('enter the string')
flag = True
for i in range(0,len(s)//2):
  if s[i] != s[len(s) - i -1]:
    flag = False
    print('Not a Palindrome')
    break

if flag:
  print('Palindrome')
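
As an alternative to the loop above, the reversed-slice idiom from the slicing section gives a shorter check. This is only a sketch; the function name `is_palindrome` is illustrative, and the comparison is case-sensitive.

```python
def is_palindrome(text):
    # A palindrome equals its own reverse.
    return text == text[::-1]

print(is_palindrome('malayalam'))  # True
print(is_palindrome('abba'))       # True
print(is_palindrome('python'))     # False
```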

Q : Write a program to split a string without using the split() function.

In [None]:
s = input('enter the string')
L = []
temp = ''
for i in s:

  if i != ' ':
    temp = temp + i
  else:
    L.append(temp)
    temp = ''

L.append(temp)
print(L)

Q : Write a python program to convert a string to title case without using the title() function.

In [None]:
s = input('enter the string')

L = []
for i in s.split():
  L.append(i[0].upper() + i[1:].lower())

print(" ".join(L))

Q : Write a program that can convert an integer to string without using the str() function.

In [None]:
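number = int(input('enter the number'))  # assumes a non-negative integer
digits = '0123456789'
result = ''
# the loop below never runs for 0, so handle it separately
if number == 0:
  result = '0'
while number != 0:
  # pick the character for the last digit and put it in front
  result = digits[number % 10] + result
  number = number//10

print(result)
print(type(result))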

Q : Find the indexes of all the letters of the word "ritesh" in the given string, using find() only once.

In [None]:
s = "my name is Ritesh Swami , but you can just call me by my name Ritesh !"

# find() is case-sensitive, so lowercase the text before searching for "ritesh"
b = s.lower().find("ritesh")
for i in range(len("ritesh")):
    print(b+i)

Q : Write a program to find the index of the second "name" in the given string.

In [5]:
s = "my name is ritesh swami and my friend's name is rahul"
first = s.find("name")  # index of the first occurrence
# start the second search one position after the first occurrence of "name"
print(s.find("name", first + 1))

40
