# Hyperlearning AI - Introduction to Python
An introductory course to the Python 3 programming language, with a curriculum aligned to the Certified Associate in Python Programming (PCAP) examination syllabus.<br/>
https://hyperlearning.ai/knowledgebase/courses/introduction-python


## 4. Data Aggregates Part 1
https://hyperlearning.ai/knowledgebase/courses/introduction-python/modules/4/data-aggregates-part-1

In this module we will cover advanced operations with strings and lists, namely:

* **Advanced Strings** - immutability, encodings, escape sequences, string comparisons, multi-line strings, slicing, copying, cloning, and other common string methods and functions
* **Advanced Lists** - indexing, slicing, iterating, list comprehension, copying, cloning, and other common list methods and functions

### Advanced Strings
#### Character Encodings

In [3]:
# Include a Unicode character in a Python 3 string literal
hospital_in_french = 'hôpital'
hospital_in_japanese = '病院'
hospital_in_arabic = 'مستشفى'
print(f'The name for Hospital in French is: {hospital_in_french}')
print(f'The name for Hospital in Japanese is: {hospital_in_japanese}')
print(f'The name for Hospital in Arabic is: {hospital_in_arabic}')

The name for Hospital in French is: hôpital
The name for Hospital in Japanese is: 病院
The name for Hospital in Arabic is: مستشفى


In [5]:
# Include a Unicode character in a Python 3 identifier
日本の人口 = 126_500_000
print(f'The population of Japan is: {日本の人口}')

The population of Japan is: 126500000


In [6]:
# Usage of the encode() string method
directions = 'The name of the hospital is Charité - Universitätsmedizin Berlin'
print(directions.encode(encoding="ascii", errors="backslashreplace"))
print(directions.encode(encoding="ascii", errors="ignore"))
print(directions.encode(encoding="ascii", errors="namereplace"))
print(directions.encode(encoding="ascii", errors="replace"))
print(directions.encode(encoding="ascii", errors="xmlcharrefreplace"))

b'The name of the hospital is Charit\\xe9 - Universit\\xe4tsmedizin Berlin'
b'The name of the hospital is Charit - Universittsmedizin Berlin'
b'The name of the hospital is Charit\\N{LATIN SMALL LETTER E WITH ACUTE} - Universit\\N{LATIN SMALL LETTER A WITH DIAERESIS}tsmedizin Berlin'
b'The name of the hospital is Charit? - Universit?tsmedizin Berlin'
b'The name of the hospital is Charit&#233; - Universit&#228;tsmedizin Berlin'
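
The error handlers above are only needed because ASCII cannot represent `é` or `ä`. As a minimal sketch (reusing the same sentence, with `utf8_bytes` and `decoded` as illustrative names), encoding to UTF-8 instead loses nothing, and `decode()` recovers the original string:

```python
# Encode the same sentence as UTF-8, which can represent every Unicode character
directions = 'The name of the hospital is Charité - Universitätsmedizin Berlin'
utf8_bytes = directions.encode(encoding='utf-8')

# decode() turns the bytes back into a str
decoded = utf8_bytes.decode(encoding='utf-8')

# Non-ASCII characters need more than one byte in UTF-8,
# so there are more bytes than characters
print(len(directions), len(utf8_bytes))
print(decoded == directions)
```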


In [39]:
# Usage of the ascii() function
names = ['Quddus', 'Gößmann', 'José']
print(ascii(names))

['Quddus', 'G\xf6\xdfmann', 'Jos\xe9']


In [38]:
# Usage of the ord() function
dragon = "竜"
print(f'The Unicode code point value for {dragon} is {ord(dragon)}')

The Unicode code point value for 竜 is 31452


#### Escape Sequences

In [41]:
# Usage of escape characters
print('Escaping a \' single quote character')
print("Escaping a \" double quote character")
print('Escaping a \\ backslash character')
print('Escaping a \
newline character')
print('Escaping an \a ASCII bell character')
print('Escaping an \b ASCII backspace character')
print('Escaping an \f ASCII formfeed character')
print('Escaping an \n ASCII linefeed character')
print('Escaping an \r ASCII carriage return character')
print('Escaping an \t ASCII horizontal tab character')
print('Escaping an \v ASCII vertical tab character')

# Usage of escape sequences
print('Escaping the \xf6 German umlaut o character')
print('Escaping the \N{MOUSE} Unicode mouse character')
print('Escaping the \u2708 Unicode airplane character')

Escaping a ' single quote character
Escaping a " double quote character
Escaping a \ backslash character
Escaping a newline character
Escaping an  ASCII bell character
Escaping an  ASCII backspace character
Escaping an  ASCII formfeed character
Escaping an 
 ASCII linefeed character
Escaping an  ASCII carriage return character
Escaping an 	 ASCII horizontal tab character
Escaping an  ASCII vertical tab character
Escaping the ö German umlaut o character
Escaping the 🐁 Unicode mouse character
Escaping the ✈ Unicode airplane character
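
When backslashes should be kept literally, for example in Windows-style paths or regular expressions, a raw string literal avoids the escape processing shown above. A short sketch, in which the path is a made-up value used only for illustration:

```python
# In a normal literal, \n and \t would be read as a linefeed and a tab.
# The r prefix creates a raw string, where backslashes are kept as-is.
windows_path = r'C:\new_folder\table.csv'  # hypothetical path
print(windows_path)

# '\n' is one character (a linefeed); r'\n' is two (a backslash and the letter n)
print(len('\n'), len(r'\n'))
```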


#### Immutable Strings

In [43]:
# Try to change a character in a string
test_string = 'Hello World'
print(test_string[6])
test_string[6] = "Q"

W


TypeError: 'str' object does not support item assignment
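
Because item assignment is not allowed, the usual workaround is to build a new string. The sketch below shows two common ways to do so; the variable names `changed_by_slicing` and `changed_by_replace` are illustrative only:

```python
test_string = 'Hello World'

# Build a new string from the slices on either side of index 6
changed_by_slicing = test_string[:6] + 'Q' + test_string[7:]
print(changed_by_slicing)

# replace() also returns a new string rather than modifying the original
changed_by_replace = test_string.replace('W', 'Q')
print(changed_by_replace)

# The original string is unchanged
print(test_string)
```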

In [78]:
# Apply the id() function to strings
my_first_string = 'abracadabra'
my_second_string = 'abracadabra'
print(f'id(my_first_string) = {id(my_first_string)}')
print(f'id(my_second_string) = {id(my_second_string)}')

# Apply the id() function to each character in a string
for character in my_first_string:
    print(f'{character} = {id(character)}')

id(my_first_string) = 140669325644208
id(my_second_string) = 140669325644208
a = 140669646981680
b = 140669647076336
r = 140669647801520
a = 140669646981680
c = 140669647497712
a = 140669646981680
d = 140669647497456
a = 140669646981680
b = 140669647076336
r = 140669647801520
a = 140669646981680


In [59]:
# Strings are immutable, but variables can be changed to point to different things
my_string = 'I am a Data Scientist'
print(my_string)
my_string = 'I am a Software Engineer'
print(my_string)

I am a Data Scientist
I am a Software Engineer


#### Multi-Line Strings

In [67]:
my_multiline_string = '''Line 1\tEOL 
Line 2\t\tEOL
Line 3    EOL'''
print(my_multiline_string)

Line 1	EOL 
Line 2		EOL
Line 3    EOL
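
A triple-quoted literal is an ordinary string whose line breaks are stored as `\n` characters. As a small sketch using a similar string, `repr()` makes the embedded tabs and linefeeds visible and `splitlines()` separates the lines again:

```python
my_multiline_string = '''Line 1\tEOL
Line 2\t\tEOL
Line 3    EOL'''

# repr() shows the tab and linefeed characters as escape sequences
print(repr(my_multiline_string))

# splitlines() returns one list element per line, without the linefeeds
lines = my_multiline_string.splitlines()
print(lines)
```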


#### Copying and Cloning Strings

In [88]:
# Assign the same string literal to two variables; both names refer to one string object here
my_first_string = 'abracadabra'
my_second_string = 'abracadabra'
print(f'id(my_first_string) = {id(my_first_string)}')
print(f'id(my_second_string) = {id(my_second_string)}')

id(my_first_string) = 140669325669872
id(my_second_string) = 140669325669872
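
Matching ids for equal literals are an optimisation of the interpreter rather than a guarantee of the language. As a hedged sketch (the names `first` and `second` are illustrative), an equal string built at runtime may be a separate object, which is why string contents should be compared with `==` and not with `is`:

```python
first = 'abracadabra'

# Build an equal string at runtime instead of from a single literal
second = ''.join(['abra', 'cadabra'])

# == compares the characters and is the reliable way to compare strings
print(first == second)

# is compares object identity; the result depends on the interpreter,
# so it should not be used to test whether two strings are equal
print(first is second)
```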


#### String Comparisons

In [94]:
# Represent the letter ö (o with umlaut) using two different code point sequences
umlaut_sequence_1 = '\u00F6'
umlaut_sequence_2 = '\u006F\u0308'
print(f'{umlaut_sequence_1} has length {len(umlaut_sequence_1)}')
print(f'{umlaut_sequence_2} has length {len(umlaut_sequence_2)}')

ö has length 1
ö has length 2


In [96]:
# Compare strings that contain different code point sequences
import unicodedata
print(umlaut_sequence_1 == umlaut_sequence_2)
print(unicodedata.normalize('NFD', umlaut_sequence_1) == unicodedata.normalize('NFD', umlaut_sequence_2))

False
True
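
The same comparison also works with the composed form `NFC`, and `casefold()` handles case-insensitive matching. The sketch below wraps the normalisation in a helper named `normalized_equal`, which is a hypothetical function written for this example and not part of `unicodedata`:

```python
import unicodedata


def normalized_equal(left, right, form='NFC'):
    """Hypothetical helper: compare two strings after Unicode normalisation."""
    return unicodedata.normalize(form, left) == unicodedata.normalize(form, right)


composed = '\u00F6'          # ö as a single code point
decomposed = '\u006F\u0308'  # o followed by a combining diaeresis

print(normalized_equal(composed, decomposed))

# NFC composes the two code points into one
print(len(unicodedata.normalize('NFC', decomposed)))

# casefold() is a more aggressive lower() intended for caseless comparisons
print('STRASSE'.casefold() == 'straße'.casefold())
```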


#### Advanced String Slicing

In [105]:
# Create a string
my_string = 'abracadabra'

# Extract characters of a string that have even indexes using the stride size argument
print(my_string[::2])

# Use a negative stride size argument to extract in reverse order
print(my_string[::-2])

# Extract a defined subset of characters with a stride size of 2 within that subset
print(my_string[4:10:2])

# Extract characters with even indexes starting from the 5th character
print(my_string[4::2])

arcdba
abdcra
cdb
cdba
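
A few further variations on the stride argument, offered as a sketch on the same string: a stride of `-1` reverses the string, a start of `1` selects the odd indexes, and a `slice` object (named `every_other` here for illustration) stores a slice for reuse:

```python
my_string = 'abracadabra'

# A stride of -1 walks the whole string backwards
reversed_string = my_string[::-1]
print(reversed_string)

# Start at index 1 to extract the characters that have odd indexes
odd_characters = my_string[1::2]
print(odd_characters)

# slice(start, stop, step) is the object form of the [start:stop:step] syntax
every_other = slice(None, None, 2)
print(my_string[every_other])
```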


#### Common String Methods

In [120]:
# isupper()
print('MY NAME IS JILLUR'.isupper())

# islower()
print('my name is jillur Q'.islower())

# isalpha()
print('learning python'.isalpha())

# isalnum()
print('learning python 3'.isalnum())

# isdecimal()
print('01092020'.isdecimal())

# isspace()
print(' \t\n\t\v  '.isspace())

# istitle()
print('Jillur Quddus'.istitle())

# capitalize()
print('capitalize me'.capitalize())

True
False
False
False
True
True
True
Capitalize me
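
Beyond the `is...()` checks above, a handful of other string methods come up frequently. The following sketch uses an illustrative `sentence` value to show `strip()`, `split()`, `join()`, `title()` and `find()`, each of which returns a new value and leaves the original string untouched:

```python
sentence = '  learning python 3  '

# strip() removes leading and trailing whitespace
trimmed = sentence.strip()
print(trimmed)

# split() with no argument splits on runs of whitespace
words = trimmed.split()
print(words)

# join() concatenates the elements using the string it is called on
print('-'.join(words))

# title() capitalises the first letter of each word
print(trimmed.title())

# find() returns the index of the first match, or -1 if there is none
print(sentence.find('python'), sentence.find('java'))
```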


#### Common String Functions

In [124]:
# len()
print(len('abracadabra'))

# chr()
print(chr(9786))

# ord()
print(ord('愛'))

11
☺
24859
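
`ord()` and `chr()` are inverses of each other. As a small sketch, the loop below converts a few characters to their code points and back, and also prints each code point in the hexadecimal `U+XXXX` notation that corresponds to the `\u` escape sequences used earlier; `code_points` is an illustrative name:

```python
code_points = {}
for character in 'é愛☺':
    code_point = ord(character)
    code_points[character] = code_point
    # chr() maps the code point back to the original character
    round_trip = chr(code_point)
    print(f'{character} -> {code_point} -> U+{code_point:04X} -> {round_trip}')
```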


### Advanced Lists
#### Indexing