# Lab 3: Data Types Basics
DA108: Python Programming
Author: Neeraj Sharma, IIT Guwahati


# Strings
Strings in Python store text data such as names, addresses, and DNA sequences. They are an immensely useful data type.
They are essentially character sequences. Once they are assigned to a variable, Python keeps track of each element in the string in a specific order. For instance, Python interprets the string "India" as a sequence of letters arranged in a particular order. Consequently, we can utilize indexing to access specific letters, such as the first or last one.
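
As a minimal sketch of this idea, the string "India" from the paragraph above can be indexed like any other sequence. The variable names are illustrative only.

```python
# "India" is kept as an ordered sequence of five characters
country = "India"

first_letter = country[0]    # index 0 is the first character
last_letter = country[-1]    # index -1 is the last character

print(len(country))    # 5
print(first_letter)    # I
print(last_letter)     # a
```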

Understanding the concept of a sequence is pivotal in Python.
In this session, we will cover the following topics:
* Creating Strings
* Printing Strings
* Differences in Printing between Python 2 and Python 3
* String Indexing and Slicing
* String Properties
* String Methods
* Print Formatting


In [3]:
# Creating a String

my_string = "Hello, World!"
another_string = 'Python Programming'
multi_line_string = """This is a 
multi-line string."""
single_quote_string = 'It\'s raining outside.'

# Printing the assigned strings

print(my_string)
print(another_string)
print(multi_line_string)
print(single_quote_string)
# Error-prone assignments
# Assigning a string with mismatched quotes
# error_string = "This is an error string.'
# Uncommenting the line above will result in a SyntaxError


Hello, World!
Python Programming
This is a 
multi-line string.
It's raining outside.


In [None]:
# Single word
'hello'

# Entire phrase 
'This is also a string'

# We can also use double quote
"String built with double quotes"

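# Be careful with quotes!
# The next line is commented out because it raises a SyntaxError:
# the apostrophe in I'm closes the single-quoted string early
# ' I'm using single quotes, but will create an error'


"Now I'm ready to use the single quotes inside a string!"

"""This is a string with triple quotes"""

"""The man said "What is that going on over there!" """

# Also a SyntaxError: a double quote directly before the closing triple quotes
# leaves a stray quote after the string ends
# """The man said "What is that going on over there!""""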

'There went "The Animal"'
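
The quote style only affects how a string is written in the source code, not the value that is stored. A small sketch (the variable names are illustrative) that checks this:

```python
single = 'hello'
double = "hello"
triple = """hello"""

# All three literals produce the same str value
print(single == double == triple)    # True

# Switching the outer quote style avoids the need for an escape
with_double_quotes = "I'm inside double quotes"
with_escape = 'I\'m inside double quotes'
print(with_double_quotes == with_escape)    # True
```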


# Printing a String
In Python, the print() function displays the content of a string on the console.

In [None]:
print("DA108: Python programming!")

# Escape characters
Escape characters in Python are special sequences of characters that represent other characters or add specific formatting to strings. Here are some commonly used escape characters:
* `\n`: Newline - Inserts a newline character.
* `\t`: Tab - Inserts a tab character.
* `\\`: Backslash - Inserts a single backslash character.
* `\'`: Single Quote - Inserts a single quote character.
* `\"`: Double Quote - Inserts a double quote character.

These can also be made part of a string!

In [2]:
# Newline character
print("Hello\nWorld")

# Tab character
print("Hello\tWorld")

# Backslash character
print("This is a backslash: \\")

# Single quote and double quote characters
print('She said, "Hello"')
print("He said, 'Hi'")
print('I am typing in Neeraj\'s Python IDE')

Hello
World
Hello	World
This is a backslash: \
She said, "Hello"
He said, 'Hi'
I am typing in Neeraj's Python IDE
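
Each escape sequence listed above is written with two characters in the source code but stored as a single character. The sketch below checks this with len() and repr(); the path-like text is a made-up example.

```python
newline_text = "Hello\nWorld"

# 5 letters + 1 newline character + 5 letters
print(len(newline_text))     # 11

# repr() shows the escape sequence instead of breaking the line
print(repr(newline_text))    # 'Hello\nWorld'

# A doubled backslash is stored as one backslash (hypothetical text)
backslash_text = "C:\\temp"
print(backslash_text)        # C:\temp
print(len(backslash_text))   # 7
```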


# String Basics
* Variable assignment with a string
* String operation: concatenation and repetition
* String indexing and slicing
* String Storage
* String Methods

## Variable assignment with a string
Variable assignment with a string in Python involves assigning a sequence of characters enclosed within single, double, or triple quotes to a variable.

In [11]:
single_quoted_string = 'Hello, World!'
double_quoted_string = "Python Programming"
double_quoted_with_single_quote = "It's raining outside."
escaped_string = "This is a newline\nand this is a tab\tand this is a backslash \\"
# print(escaped_string)

## String Indexing
1. Strings are indexed from $0$ to $n-1$, where $n$ is the length of the string.
2. Individual characters can be accessed using square brackets $[ ]$.
3. Negative indexing is also supported, where -1 refers to the last character, -2 refers to the second-to-last character, and so on.
4. Slicing allows you to extract substrings from a string using the syntax $[start:end:step]$, where the character at index $end$ is not included.

In [1]:
s = "Python"
print(s[0])     # Output: 'P'
print(s[-1])    # Output: 'n'
print(s[1:4])   # Output: 'yth'
print(s[::2])   # Output: 'Pto'

P
n
yth
Pto


In [2]:
# Original string
original_string = "Python Programming"

# Positive indexing (forward slicing)
substring_1 = original_string[7:18]   # "Programming"
substring_2 = original_string[0:6]    # "Python"

# Negative indexing (backward slicing)
substring_3 = original_string[-11:-1] # "Programmin"
substring_4 = original_string[-18:-7] # "Python Prog"

# Omitted start or end index
substring_5 = original_string[:6]     # "Python"
substring_6 = original_string[7:]     # "Programming"

# Using step value
substring_7 = original_string[::2]    # "Pto rgamn"
substring_8 = original_string[::-1]   # "gnimmargorP nohtyP"

# Print results
print("Substring 1:", substring_1)
print("Substring 2:", substring_2)
print("Substring 3:", substring_3)
print("Substring 4:", substring_4)
print("Substring 5:", substring_5)
print("Substring 6:", substring_6)
print("Substring 7:", substring_7)
print("Substring 8:", substring_8)


Substring 1: Programming
Substring 2: Python
Substring 3: Programmin
Substring 4: Python Prog
Substring 5: Python
Substring 6: Programming
Substring 7: Pto rgamn
Substring 8: gnimmargorP nohtyP
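
A negative index k refers to the same position as len(s) + k. The sketch below uses this rule to show why the slice [-18:-7] returns "Python Prog" and which negative slice returns just "Python". The names text and n are illustrative.

```python
text = "Python Programming"
n = len(text)    # 18

# A negative index k points at position n + k
print(text[-1] == text[n - 1])       # True

# -18 maps to 0 and -7 maps to 11, so this is text[0:11]
print(text[-18:-7] == text[0:11])    # True
print(text[-18:-7])                  # Python Prog

# -12 maps to 6, so this is text[0:6]
print(text[-18:-12])                 # Python
```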


## String storage
In CPython, the standard Python implementation, the characters of a string are stored in contiguous memory locations, one after another. This layout makes access to individual characters and substrings efficient. Note that id() in the cells below returns the identity of an object: indexing a string yields a separate one-character string object, so the printed values are identities of those objects, not addresses inside the original string.

In [7]:

a = "hello"
b = 'h'
print(id(b))
print(id(a))
print(id(a[0]))
print(id(a[1]))


140705710221168
140705643831344
140705710221168
140705710629488


In [8]:
# Define a string
my_string = "hello"

# Print the identity of the string object
print("Identity of my_string:", id(my_string))

# Iterate through each character in the string
for char in my_string:
    # Print the character and its identity
    print("Character:", char, "Identity:", id(char))


Identity of my_string: 140705643831344
Character: h Identity: 140705710221168
Character: e Identity: 140705710629488
Character: l Identity: 140705710182448
Character: l Identity: 140705710182448
Character: o Identity: 140705710852272
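
The identities printed above belong to one-character string objects, which is why they are not consecutive numbers. As a rough sketch, the code below confirms that indexing returns a str of length 1 and then looks at how the object size grows as characters are added. The use of sys.getsizeof here is an illustration that relies on CPython behaviour, not something the language guarantees, so no expected sizes are listed.

```python
import sys

a = "hello"
first = a[0]    # indexing returns a str object of length 1

print(type(first).__name__, len(first))    # str 1

# CPython-specific illustration (an assumption, not a language guarantee):
# measure the object size for strings of 1 to 5 ASCII characters
sizes = [sys.getsizeof("a" * n) for n in range(1, 6)]

# Difference in size between each string and the next longer one
growth = [later - earlier for earlier, later in zip(sizes, sizes[1:])]

print(sizes)
print(growth)
```

If the growth is the same small constant at every step, that is consistent with the characters being packed one after another in a single block.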


## String Immutability
Strings have an important property known as immutability. This means that once a string is created, the elements within it cannot be changed or replaced. Immutability is also seen in other data types, such as tuples.
In other words, the data cannot change in place. However, a new object can be created from the current one.

In [13]:
# Original string
original_string = "Python"

# Attempt to change the first character
original_string[0] = 'J'  # Raises TypeError, so the lines below do not run

# Concatenating a new string
new_string = original_string + " is awesome!"

# Printing original and new strings
print("Original string:", original_string)
print("New string:", new_string)


TypeError: 'str' object does not support item assignment
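
Because the failed assignment stops the cell, the remaining lines never run. A minimal sketch of one way to see both behaviours together is to catch the error and then build a new string from parts of the old one. The variable message and the "Jython" result are illustrative.

```python
original_string = "Python"

try:
    original_string[0] = "J"    # item assignment is not allowed on str
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

# Build a new string from pieces of the old one instead
new_string = "J" + original_string[1:]

print(original_string)    # Python
print(new_string)         # Jython
```

The original string is untouched; new_string is a separate object.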

## String Concatenation and Repetition

In [14]:
# String concatenation
str1 = "Hello"
str2 = "World"
concatenated_string = str1 + " " + str2
print("Concatenated string:", concatenated_string)

# String repetition
original_str = "Python "
repeated_str = original_str * 3
print("Repeated string:", repeated_str)


Concatenated string: Hello World
Repeated string: Python Python Python 


## String Methods

In [16]:
# Original string
my_string = "hello, world!"

# Methods applicable to strings
# 1. Capitalize the first letter
print("Capitalized:", my_string.capitalize())

# 2. Convert to uppercase
print("Uppercase:", my_string.upper())

# 3. Convert to lowercase
print("Lowercase:", my_string.lower())

# 4. Count occurrences of a substring
print("Occurrences of 'l':", my_string.count('l'))

# 5. Check if string starts with a particular substring
print("Starts with 'hello':", my_string.startswith('hello'))

# 6. Check if string ends with a particular substring
print("Ends with 'world!':", my_string.endswith('world!'))

# 7. Find the index of a substring
print("Index of 'world!':", my_string.find('world!'))

# 8. Replace occurrences of a substring
print("Replaced:", my_string.replace('hello', 'hi'))

# 9. Split the string into a list of substrings
print("Split:", my_string.split(','))

# 10. Strip leading and trailing whitespace
my_string_with_whitespace = "  hello, world!  "
print("Stripped:", my_string_with_whitespace.strip())

# 11. Check if all characters are alphanumeric
print("Is alphanumeric:", my_string.isalnum())

# 12. Check if all characters are alphabetic
print("Is alphabetic:", my_string.isalpha())

# 13. Check if all characters are digits
print("Is digit:", my_string.isdigit())


Capitalized: Hello, world!
Uppercase: HELLO, WORLD!
Lowercase: hello, world!
Occurrences of 'l': 3
Starts with 'hello': True
Ends with 'world!': True
Index of 'world!': 7
Replaced: hi, world!
Split: ['hello', ' world!']
Stripped: hello, world!
Is alphanumeric: False
Is alphabetic: False
Is digit: False
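
Since strings are immutable, each of these methods returns a new value and leaves my_string unchanged. The short sketch below shows this, along with the value find() returns when the substring is absent and a chained call. The names shouted, missing, and words are illustrative.

```python
my_string = "hello, world!"

# upper() returns a new string; the original is unchanged
shouted = my_string.upper()
print(shouted)      # HELLO, WORLD!
print(my_string)    # hello, world!

# find() returns -1 when the substring is not present
missing = my_string.find("python")
print(missing)      # -1

# Methods can be chained because each one returns a new object
words = my_string.replace("!", "").split(", ")
print(words)        # ['hello', 'world']
```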


# DNA Sequence Handling

In [18]:
# assign DNA sequence to a variable
DNA_seq = "ATGTACTC ATTCGTTTCG GAAGAGACAG GTACGTTAAT AGTTAATAGC GTACTTCTTT TTCTTGCTTT CGTGGTATTC TTGCTAGTTA CACTAGCCAT CCTTACTGCG CTTCGATTGT GTGCGTACTG CTGCAATATT GTTAACGTGA GTCTTGTAAA ACCTTCTTTT TACGTTTACT CTCGTGTTAA AAATCTGAAT TCTTCTAGAG TTCCTGATCT TCTGGTCTAA"