# Python Strings

A Python string, like 'Hello', stores text as *a sequence of individual characters*.  A string is simply *a piece of text*.

Strings are of type ```str```.  Create a string using single, double or triple (single or double) quotation marks:

* ```'a string'```   
* ```"a string"```   
* ```'''a string'''```   
* ```"""a string"""```   

In [1]:
# either quote style works, as long as the opening and closing quotes match: '...' or "..."
print ("Singapore")

print ( type("Singapore") )   # strings are of type 'str'

Singapore
<class 'str'>


Text is central to many computations -- URLs, chat messages, the underlying HTML code that makes up web pages, style sheets, data files, etc.

Python strings are written between single quotes, 'Singapore'.  Alternatively, they can also be written between double quotes, "Tokyo 2020 Olympics".

```a = 'Singapore'```  
```b = "Tokyo 2020 Olympics"```

Each character in a string is drawn from the Unicode character set, which includes the characters of just about every language on earth, plus many emojis.

```c = "\u674e\u5149\u8000"```   # "Lee Kuan Yew" in Chinese  
```d = "\u0420\u043e\u0441\u0441\u0438\u044f"```   # "Russia" in Cyrillic

A given string must be internally consistent.  This means that if a string starts with a single quote, it *must* end with a single quote.  If it starts with a double quote, it *must* end with a double quote.
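One practical consequence of the matching rule is that the other quote style can appear freely inside the string. The sketch below illustrates this; the variable names and sample sentences are made up for the example.

```python
# Pick the outer quote style that does not clash with quotes inside the text.
apostrophe = "It's a sunny day in Singapore"   # double quotes outside, single quote inside
dialogue = 'She said "hello" and left'         # single quotes outside, double quotes inside
escaped = 'It\'s a sunny day in Singapore'     # alternatively, escape the inner quote with a backslash

print(apostrophe)
print(dialogue)
print(apostrophe == escaped)   # True: both spellings produce the same string
```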

Let's start by creating (or defining) a few strings.

In [1]:
# using single quotes ('')

a = ''   # this is called an "empty string" (a string with zero characters)
b = 'Q'   # a string can have just a single character
c = 'Singapore'
d = '  Singapore  '
e = 'Tokyo Olympics'
f = 'Wee Kim Wee School of Communication and Information (Wakawaski)'

print ( a )
print ( b )
print ( c )
print ( d )
print ( e )
print ( f )


Q
Singapore
  Singapore  
Tokyo Olympics
Wee Kim Wee School of Communication and Information (Wakawaski)


In [2]:
# using double quotes ("")

a = ""   # this is called an "empty string" (a string with zero characters)
b = "Q"   # a string can have just a single character
c = "Singapore"
d = "  Singapore  "
e = "Tokyo Olympics"
f = "Wee Kim Wee School of Communication and Information (Wakawaski)"

print ( a )
print ( b )
print ( c )
print ( d )
print ( e )
print ( f )


Q
Singapore
  Singapore  
Tokyo Olympics
Wee Kim Wee School of Communication and Information (Wakawaski)


#### Note:

I prefer double quotes.  This is because JSON (we will be using this file format later) uses double quotes exclusively.
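As a small illustration of the JSON point, the sketch below uses the standard-library json module. The dictionary and its contents are made up for the example.

```python
import json

record = {'city': 'Singapore'}     # single quotes are fine in Python source code
as_json = json.dumps(record)       # convert the dictionary to JSON text
print(as_json)                     # {"city": "Singapore"}

try:
    json.loads("{'city': 'Singapore'}")   # JSON text with single quotes is rejected
except json.JSONDecodeError as error:
    print("Invalid JSON:", error)
```

Python itself treats both quote styles identically; only the JSON text format insists on double quotes.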

## Using Triple Quotes

For a string that spans multiple lines, use triple quotes (either triple single quotes or triple double quotes).

```'''a long string'''```       (using triple single quotes)

```"""a long string"""```       (using triple double quotes)

Such strings are called multi-line strings.  An example of a multi-line string is given below:

In [3]:
# This is the full transcript of PM Lee's address on transitioning to phase 3, Covid-19 vaccination
# Source: https://www.straitstimes.com/singapore/pm-lees-remarks-on-transitioning-to-phase-3-and-the-covid-19-vaccine

speech = """My fellow Singaporeans, we are coming to a full year since our first Covid-19 case.

It has been a year of uncertainty, full of ups and downs, filled with anxiety and trepidation. But much has changed within the last few months.

In March and April, we peaked at over 1,000 cases a day. Now on most days, we have zero cases of local transmissions.

When the pandemic first started, we worried if there would be enough supplies in the supermarkets. Today, supermarket shelves are full, and shopping is calm and uneventful.

Parents were worried then whether their kids should go to school, but we kept the school year intact, CCAs have resumed, and graduating students have finished exams and are waiting for their results.

We will not forget the two months of circuit breaker in April and May. But today, life is a lot more normal. We go to work, dine out and meet friends, though in groups no more than five.

How did we bring things under control?

It took a tremendous effort, and some good luck. Our measures were hard for everyone, but they worked. Singaporeans showed resilience and took them in their stride.

Our economy took a big hit, but we did not let it crash. Despite the global economic dislocation, most of our workers kept their jobs.

Now, our defences against Covid-19 are much stronger.

We have steadily built up our testing capacities and procedures. We introduced rostered routine testing of higher risk groups. We started using antigen rapid tests to resume larger gatherings and events safely.

We also beefed up our contact tracing capabilities, for example, expanding our SafeEntry and TraceTogether programmes, and distributing TraceTogether tokens.

We got used to the inconvenient restrictions, and found ways to carry on with life. We looked after one another, reminding each other to adhere to safe distancing, to wear masks, to see a doctor if ill, and so on.

I am very grateful that Singaporeans have complied with the spirit, and not just the letter, of the rules. We stayed united, kept up our guard, and did not allow ourselves to become complacent over time.

With everyone's full support, our enhanced safeguards worked, and we could gradually ease our restrictions, and we can be proud of how far we have come.

Because of your efforts, we are now ready to progress to the next phase.

Phase 3 will begin in two weeks' time, on Dec 28, so we will end the year with some good news.

The ministerial task force will explain the details immediately after my broadcast.

We will ease capacity limits in public places like malls and attractions, and at places of worship. One significant change is to allow groups of up to eight to congregate, up from the current maximum of five.

So eight people can dine out together, or visit someone's home. This will make it easier to hold family get-togethers during the festive period.

Please understand that even as we enter phase 3, the battle is far from won.

The Covid-19 virus has not been eradicated. There is a long way to go. Around the world, the pandemic is still raging. Many countries are seeing second, third, or even fourth waves of infection, with record numbers of daily cases. International borders remain largely closed.

But trade and travel are our lifeblood, and the longer our own borders stay closed to travellers, the greater the risk of us permanently losing out as an international hub, and consequently hurting our livelihoods.

Therefore, our only option is to reopen our borders in a controlled and safe manner. As we do so, we will see more imported cases. And there will be some risk of these imported cases spreading to the community.

We have already had a few cases recently. An airport staff, who likely came into contact with infected passengers. A marine worker, who picked up the virus after boarding ships to do repair and resupply work.

This is a calculated risk we have to accept. But the Government will take every precaution, and do our best to prevent imported cases from triggering a new outbreak.

At the same time, Singaporeans must keep their guard up, because the virus is most likely still circulating silently within our community. Each of us needs to play our part.

By all means make use of the higher limits and reconnect with friends and family, but please do not abandon your mindset of watchfulness and caution.

This is absolutely not the time to relax, and let our guard down, or to hold a big party, imagining that the problem has disappeared.

Progressing from phase 2 to phase 3 is a calibrated, careful move.

We are easing the restrictions in a controlled way, so that we can keep the Covid-19 situation stable, and take more steps forward later. So it is vital that you stay cautious and vigilant, continue to cooperate with the Government, and comply with the rules and restrictions that will apply in phase 3.

How long will we have to keep this up for? It may be for quite a while, possibly a year or more.

One key factor is how soon Covid-19 vaccines become available to us.

The Government has been working quietly behind the scenes, since early in the pandemic, to secure access to vaccines. This was not a simple exercise.

More than 200 vaccine candidates were being developed, and not all would succeed. We started talking to the pharmaceutical companies early to understand the science, and identify the promising candidates and the vaccines likely to reach production sooner.

We set aside more than $1 billion. We placed multiple bets to sign advance purchase agreements and make early down payments for the most promising candidates, including with Moderna, Pfizer-BioNTech, and Sinovac.

We made arrangements with pharmaceutical companies to facilitate their clinical trials and drug development in Singapore, and attracted a few to establish vaccine manufacturing capabilities here.

We also supported local efforts to develop a vaccine. This gave our own scientists and researchers the opportunity to do cutting-edge work. It was also insurance, in case the global supply chain was disrupted.

This way, we built up a diversified portfolio of options to ensure that Singapore would be near the front of the queue for vaccines, and not last in line.

Securing early access to vaccines was a whole-of-government effort. Many agencies and public officers, led by the head of the civil service, were involved in this critical mission.

I commend them for their good work. They are among the legion of unsung heroes who have helped us get through this crisis.

As you would have read in the news, the first vaccines are now coming into production, and I am very happy to tell you that after studying the scientific evidence and clinical trial data, the Health Sciences Authority, HSA, has approved the Pfizer-BioNTech vaccine for pandemic use.

The first shipment should arrive by the end of this month, making Singapore one of the first countries to obtain this vaccine. We also expect other vaccines to arrive in Singapore in the coming months.

If all goes according to plan, we will have enough vaccines for everyone in Singapore by the third quarter of 2021.

The Ministry of Health has set up a committee of doctors and experts to recommend a vaccination strategy for us.

The committee has proposed that our entire adult population should be vaccinated, but to make vaccinations voluntary.

First priority will be given to those who are at greatest risk: healthcare workers and front-line personnel, as well as the elderly and vulnerable.

Thereafter, the committee proposes to progressively vaccinate the rest of the population, and to cover everyone who wants a vaccination by the end of next year.

The Government has accepted these recommendations. I have personal confidence in our experts.

My Cabinet colleagues and I, including the older ones, will be getting ourselves vaccinated early. This is to show you, especially seniors like me, that we believe the vaccines are safe.

We have decided to make vaccinations free for all Singaporeans, and for all long-term residents who are currently here.

So I strongly encourage you to get vaccinated, too, when the vaccine is offered to you. Because when you get yourself vaccinated, you are not just protecting yourself, you are also doing your part to protect others, especially your loved ones.

The more of us are vaccinated, the harder it will be for the virus to spread, and the safer we will all be as a society.

Vaccines will support our recovery in more ways than one.

As a global aviation hub, we play a crucial role transporting vaccines around the world.

Vaccines require cold chain management. An ordinary refrigerator is not good enough. The Pfizer vaccine needs to be stored at minus 70 deg C, colder than the Arctic in winter!

This requires infrastructure, high standards, skilled personnel, and good connectivity to many different countries all along the supply chain.

Fortunately, Singapore has a strong ecosystem for cargo handling.

Leading global logistics companies like DHL, UPS and FedEx are based here. SIA and Changi Airport's ground handling partners are certified by Iata (International Air Transport Association) to handle and transport pharmaceutical supplies.

We are now gearing ourselves up to handle large volumes of vaccine shipments into Singapore and through Singapore to help win the global fight against Covid-19.

We did not get here overnight. We have always planned ahead, systematically creating opportunities for ourselves. It took us years of investment and planning, building a business-friendly climate and expanding our air links around the world.

These long-term investments are now paying dividends.

During this immediate crisis, we have reacted quickly and comprehensively, marshalled resources to solve our problems, and stayed resilient.

Our situation is now stable, but only because everyone has worked so hard, and sacrificed so much.

Now that vaccines are becoming available, we can see light at the end of the tunnel.

As vaccinations become widespread not only in Singapore, but also in our region and the world, we can look forward to resuming more normal lives.

Let us keep up our efforts in this final stretch, to cross the finish line together, and complete our mission to defeat Covid-19.

Thank you."""

print ( speech )

My fellow Singaporeans, we are coming to a full year since our first Covid-19 case.

It has been a year of uncertainty, full of ups and downs, filled with anxiety and trepidation. But much has changed within the last few months.

In March and April, we peaked at over 1,000 cases a day. Now on most days, we have zero cases of local transmissions.

When the pandemic first started, we worried if there would be enough supplies in the supermarkets. Today, supermarket shelves are full, and shopping is calm and uneventful.

Parents were worried then whether their kids should go to school, but we kept the school year intact, CCAs have resumed, and graduating students have finished exams and are waiting for their results.

We will not forget the two months of circuit breaker in April and May. But today, life is a lot more normal. We go to work, dine out and meet friends, though in groups no more than five.

How did we bring things under control?

It took a tremendous effort, and some good luck.
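A multi-line string is an ordinary string in which each line break is stored as a newline character. The short sketch below, using a made-up three-line string, shows one way to see this.

```python
note = """Phase 2
Phase 3
Vaccination"""

print(repr(note))               # 'Phase 2\nPhase 3\nVaccination'
print(note.count("\n"))         # 2 newline characters separate the 3 lines
print(len(note.splitlines()))   # 3
```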

## Unicode

Unicode is a character set that assigns a unique number, called a code point, to each character.  It covers the writing systems of most of the world's languages, along with symbols and emoji.

The standard is maintained by the Unicode Consortium.  Unicode escapes such as ```\u03A9``` make it possible to print characters, for example Greek letters or emoji, that are hard to type on a keyboard.

Emoji are pictographs (pictorial symbols) that are typically presented in a colorful form and used inline in text.  They represent things such as faces, weather, vehicles and buildings, food and drink, animals and plants, or icons that represent emotions, feelings, or activities.  To the computer, each is simply another character, but people send each other billions of emoji every day to express love, thanks, congratulations, or any number of a growing set of ideas.

For the complete Unicode character table, consult: [Unicode Character Table](https://unicode-table.com/)

For the latest edition of Unicode, consult: [The Unicode Consortium](https://www.unicode.org/)

At the time of writing, the latest version of Unicode is version 14.0.0 (containing 144,697 characters), which was released on 14 September 2021.

In [5]:
import unicodedata

# Look up a character by its official Unicode name (useful if you don't know the \u code).
# The arguments below are Unicode names, not escape codes, so they do not start with \u.
# lookup() returns the character itself.

print (unicodedata.lookup('HYPHEN'))
print (unicodedata.lookup('HIGH VOLTAGE SIGN'))
print (unicodedata.lookup('NO ENTRY'))

c = "\u674e\u5149\u8000" # "Lee Kuan Yew" in Chinese
d = "\u0420\u043e\u0441\u0441\u0438\u044f" # "Russia" in Cyrillic
print (c)
print (d)

‐
⚡
⛔
李光耀
Россия
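Besides unicodedata.lookup(), Python offers a few built-in ways to move between a character, its name and its code point. The sketch below shows the \N{...} escape together with ord() and chr(); the variable names are illustrative.

```python
import unicodedata

bolt = "\N{HIGH VOLTAGE SIGN}"     # name-based escape inside a string literal
print(bolt == unicodedata.lookup("HIGH VOLTAGE SIGN"))   # True

omega = "\u03A9"
print(ord(omega))             # 937, the code point as a decimal integer
print(hex(ord(omega)))        # 0x3a9, the same code point in hexadecimal
print(chr(0x03A9) == omega)   # True: chr() is the inverse of ord()
```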


In [8]:
import unicodedata

# The opposite of lookup(): name() returns the Unicode name of a character, not its \u code.
    
print (unicodedata.name(u'潭'))     # the u prefix marks a Unicode string; it is optional in Python 3
print (unicodedata.name(u'耀'))
print (unicodedata.name(u'|'))
print (unicodedata.name(u'~'))
print (unicodedata.name(u'{'))

CJK UNIFIED IDEOGRAPH-6F6D
CJK UNIFIED IDEOGRAPH-8000
VERTICAL LINE
TILDE
LEFT CURLY BRACKET


In [9]:
import unicodedata

# category() returns the two-letter general category of a character.
# Five categories appear below; there are many more, and Unicode 14.0 has about 144k characters in total.

print (unicodedata.category(u'A'))
print (unicodedata.category(u'b'))
print (unicodedata.category(u'潭'))
print (unicodedata.category(u'ß'))
print (unicodedata.category(u':'))

# refer to https://www.fileformat.info/info/unicode/category/index.htm for details

Lu
Ll
Lo
Ll
Po


In [25]:
# \u takes exactly 4 hex digits; \U takes exactly 8, so \U can reach code points above U+FFFF

print ('\U00000030')   # \U needs 8 hex digits; pad with leading zeros
print ('\u0030')       # \u needs 4 hex digits; pad with leading zeros
print ('\U0001f340')   # the four leaf clover, U+1F340
print ('\u1f340')      # not the same: this is read as \u1f34 followed by the character "0", so two characters are printed

# The names below hold Greek letters and other symbols

OMEGA = "\u03A9"
DELTA = "\u0394"
sigma = "\u03C3"
epsilon = "\u03B5"
degrees = "\u00B0"
vector = "v = 6i\u0302 + 4j\u0302 - 2k\u0302"   # each \u0302 counts as 1

print ( OMEGA )
print ( DELTA )
print ( sigma )
print ( epsilon )
print ( degrees )
print ( vector )

print ()

# These are emoji
# refer to https://unicode.org/emoji/charts/full-emoji-list.html
# to turn a code point into an escape, replace the "+" with "000" and add a leading backslash: U+1F600 becomes \U0001F600

grinning_face = "\U0001f600"                # from U+1F600
grinning_squinting_face = "\U0001F606"      # from U+1F606
laughing_with_tears = "\U0001F923"          # from U+1F923 (same code point as rolling_on_the_floor_laughing; "face with tears of joy" is U+1F602)
face_with_halo = "\U0001F607"               # from U+1F607
rolling_on_the_floor_laughing = "\U0001F923"  # from U+1F923

print ( grinning_face )
print ( grinning_squinting_face )
print ( laughing_with_tears )
print ( face_with_halo )
print (rolling_on_the_floor_laughing )

0
0
🍀
ἴ0
Ω
Δ
σ
ε
°
v = 6î + 4ĵ - 2k̂

😀
😆
🤣
😇
🤣
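The difference between the 4-digit and 8-digit escapes can be checked with len(). This is a minimal sketch; the variable names are illustrative.

```python
clover = "\U0001f340"     # one character: U+1F340
not_clover = "\u1f340"    # two characters: U+1F34 followed by the digit "0"

print(len(clover))                # 1
print(len(not_clover))            # 2
print(not_clover[1])              # 0
print(hex(ord(not_clover[0])))    # 0x1f34
```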


In [30]:
print ("I hope you all are not \U0001F971!")   # https://unicode-table.com/en/1F971/

import unicodedata

print (unicodedata.lookup('LEFT CURLY BRACKET')) 
print (unicodedata.lookup('NO ENTRY'))

I hope you all are not 🥱!
{
⛔


### Exercise

In your own time, experiment with printing out Unicode characters!

Emojis can be fun to play around with!

## Unpacking Strings

This refers to the process of assigning each character of a string to separate variables.

The concept of unpacking an object is an important one.  It can also be used with lists and tuples.  We will unpack lists and tuples when we learn them.

Note that when the number of variables does not match the number of characters in the string, a ValueError is raised.

In [36]:
c1, c2, c3, c4, c5 = "Japan"   # "Japan" has five letters
# assigning each character as a variable, to unpack

print (len("Japan"))
print ( c1 )   # c1 will store the character "J"
print ( c2 )   # c2 will store the character "a"
print ( c3 )   # c3 will store the character "p"
print ( c4 )   # c4 will store the character "a"
print ( c5 )   # c5 will store the character "n"

5
J
a
p
a
n


In [33]:
c1, c2, c3, c4, c5, c6 = "Japan"   # ValueError - "Japan" has five letters, not six!

print ( c1 )
print ( c2 )
print ( c3 )
print ( c4 )
print ( c5 )
print ( c6 )

ValueError: not enough values to unpack (expected 6, got 5)

In [35]:
c1, c2, c3, c4 = "Japan"   # ValueError - "Japan" has five letters, not four!

print ( c1 )
print ( c2 )
print ( c3 )
print ( c4 )

ValueError: too many values to unpack (expected 4)
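When the exact length is not known in advance, a starred name can absorb the leftover characters, and a try/except block can handle a mismatch. Both are standard Python features; the sketch below is only one possible way to use them.

```python
# A starred name collects whatever is left over, as a list of characters.
first, *middle, last = "Japan"
print(first)     # J
print(middle)    # ['a', 'p', 'a']
print(last)      # n

# Catching the ValueError lets the program continue after a mismatch.
try:
    c1, c2, c3, c4 = "Japan"
except ValueError as error:
    print("Could not unpack:", error)
```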

## String Length

The ```len()``` function returns the length of a string, the number of characters in it.

Note that a space is counted as a character.

It is valid to have a string of zero characters, written just as '' or "".  Such a string is called an "empty string".  The length of an "empty string" is zero.

The ```len()``` function cannot be applied to integers, floating-point values, or complex numbers.

In [39]:
a = ""   # this is an "empty string"
a1 = "  "  # 2 spaces
b = "Q"   # a string can have just a single character
c = "Singapore"
c1 = "Singapore "
d = "  Singapore  "
e = "Tokyo Olympics"
f = "Wee Kim Wee School of Communication and Information (Wakawaski)"

print ( len(a) )   # the length of an empty string is zero
print ( len(a1))   # each space counts as one character, so two spaces give 2 (a single space would give 1, not 0)
print ( len(b) )
print ( len(c) )   # the length of the string "Singapore" is 9
print ( len(c1))   # the space is counted as 1
print ( len(d) )   # the length of the string *includes* spaces
print ( len(e) )
print ( len(f) )

0
2
1
9
10
13
14
63


In [40]:
print ( len("Singapore") )

9


In [41]:
OMEGA = "\u03A9"   # Greek letters (and Unicode characters, in general), count as 1
DELTA = "\u0394"
sigma = "\u03C3"
epsilon = "\u03B5"
degrees = "\u00B0"
vector = "v = 6i\u0302 + 4j\u0302 - 2k\u0302"   # each \u0302 counts as 1

print ( len(OMEGA) )
print ( len(DELTA) )
print ( len(sigma) )
print ( len(epsilon) )
print ( len(degrees) )
print ( len(vector) )

1
1
1
1
1
19


In [42]:
print ( len("v = 6i\u0302 + 4j\u0302 - 2k\u0302") )

# make sure you know how the output (19) is arrived at

19
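The count of 19 arises because each combining circumflex (\u0302) is a separate character, even though it is drawn on top of the preceding letter. The sketch below shows the same effect with an accented "e", and uses unicodedata.normalize() to merge the pair into a single precomposed character where one exists.

```python
import unicodedata

decomposed = "e\u0301"    # "e" followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed))         # 2: base letter plus combining mark
print(len(composed))           # 1: a single precomposed character
print(composed == "\u00e9")    # True
```

Both strings look the same when printed, so len() is the quickest way to tell them apart.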


In [43]:
print ( len(speech) )

10290


In [44]:
count = 5

print ( len(count) )   # TypeError -- len() cannot be applied to integers (int)

TypeError: object of type 'int' has no len()

In [45]:
mathematical_constant = 3.1415

print ( len(mathematical_constant) )   # TypeError -- len() cannot be applied to floating-point values (float)

TypeError: object of type 'float' has no len()

In [46]:
a_simple_complex_number = 3 + 4j

print ( len(a_simple_complex_number) )   # TypeError -- len() cannot be applied to complex numbers (complex)

TypeError: object of type 'complex' has no len()
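If the goal is to count the characters in the printed form of a number, one option is to convert the number to a string first. A minimal sketch with made-up values:

```python
year = 2020
mathematical_constant = 3.1415

print(len(str(year)))                    # 4 digits
print(len(str(mathematical_constant)))   # 6: the decimal point counts as a character
```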

#### Future Alert

The ```len()``` function is used very often.

We will encounter it again later when discussing lists, dictionaries, tuples and sets.  Remember to look out for it!

In general, when you use ```len()```, think "how many elements".

## Integers/Floating-point Values vs Strings

It is important to be able to distinguish between numbers (integer and floating-point) and strings.

The str() function is used to convert numbers (integer and floating-point) to strings.
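The conversion also works in the opposite direction: int() and float() turn suitable strings back into numbers. The sketch below contrasts string behaviour with numeric behaviour; the variable names are illustrative.

```python
text_int = "5"
text_float = "3.1415"

print(text_int + text_int)              # 55  (strings are joined)
print(int(text_int) + int(text_int))    # 10  (numbers are added)
print(float(text_float) * 2)            # 6.283

try:
    int("five")                         # not a valid integer literal
except ValueError as error:
    print("Cannot convert:", error)
```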

In [47]:
print ( 3 + "a string")

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In [50]:
a = 5   # an integer
b = 3.1415   # a floating-point value

c = str(a)   # convert the integer stored in variable a to a string and assign it to c
d = str(b)   # convert the floating-point value stored in variable b to a string and assign it to d

print (a, c, sep=" --- ")   # visually, a and c look exactly the same -- they are different in type
print (b, d, sep=" --- ")   # visually, b and d look exactly the same -- they are different in type

# print (a, c) prints both values on one line
# the sep argument sets the separator between them, e.g. print (a, c, sep="==>")
print (a,c)  # by default, the separator is a single space
print ()
print ( type(a) )
print ( type(c) )
print ()
print ( type(b) )
print ( type(d) )

5 --- 5
3.1415 --- 3.1415
5 5

<class 'int'>
<class 'str'>

<class 'float'>
<class 'str'>


## String Indexing

String indexing is used to access a single character in a string.

The order of the characters in a string matters.

"Singapore" has meaning, while "poreaSing", which has the same letters in a different order, does not.

Because strings are ordered, the individual characters can be accessed using indexing.  An index is simply an integer that indicates position.

The individual characters in a string are accessed using square brackets and zero-based indexing, so the first character is at index 0, the next character is at index 1, and the last character is at index (length_of_string - 1).

String indexing is useful when referring to or retrieving *a single character* in the string.

Example:

```language = "PYTHON"```

```language[0] == "P"```      # zero-based indexing means that the first character in the string has an index of 0   
```language[1] == "Y"```  
```language[2] == "T"```  
```language[3] == "H"```  
```language[4] == "O"```  
```language[5] == "N"```      # an index of 5 is used to access the last character in the string as the length of the string is 6

In [3]:
language = "COBRA "

print ( language[0] )   # "C"
print ( language[1] )   # "O"
print ( language[2] )   # "B"
print ( language[3] )   # "R"
print ( language[4] )   # "A"
print ( language[5] )   # " " (no IndexError here, because the string ends with a space at index 5)

C
O
B
R
A
 


### Exercise

Print the following characters from the string "Tokyo 2020 Olympic Games":

* the letter y in the word "Tokyo"
* the first 0 in the word "2020"
* the letter p in the word "Olympic"
* the letter m in the word "Games"

In [4]:
event = "Tokyo 2020 Olympic Games"

print ( event[3] )
print ( event[7] )
print ( event[15] )
print ( event[21] )
print ()   # prints an empty line
print ( len(event) )

y
0
p
m

24


In [5]:
print ( "Tokyo 2020 Olympic Games"[3] )
print ( "Tokyo 2020 Olympic Games"[7] )
print ( "Tokyo 2020 Olympic Games"[15] )
print ( "Tokyo 2020 Olympic Games"[21] )
print ()   # prints an empty line
print ( len("Tokyo 2020 Olympic Games") )

y
0
p
m

24


When you index a character "beyond" the string, you get an IndexError.

In [6]:
event = "Tokyo 2020 Olympic Games"   # from the previous cell, we know event has only got 24 characters

print ( event[30] )   # IndexError -- the word does not have so many characters!

IndexError: string index out of range
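One way to avoid the IndexError is to compare the index against len() before using it, or to catch the exception. In the sketch below, char_at is a hypothetical helper written for this example, not a built-in function.

```python
event = "Tokyo 2020 Olympic Games"

def char_at(text, index):
    """Return text[index], or None when the index falls outside the string."""
    if -len(text) <= index < len(text):
        return text[index]
    return None

print(char_at(event, 3))     # y
print(char_at(event, 30))    # None

try:
    event[30]
except IndexError as error:
    print("Caught:", error)  # Caught: string index out of range
```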

## String Negative Indexing

In negative indexing, the characters are indexed backward from -1.

* the last character in the string is indexed -1
* the second last character in the string is indexed -2
* the third last character in the string is indexed -3
* and so on ...

Like string indexing, string negative indexing is useful when referring to *a single character* in the string.

In [7]:
language = "PYTHON"

print ( language[-1] )   # "N"  note: the last character is -1, not -0 (-0 equals 0, which is the first character)
print ( language[-2] )   # "O"
print ( language[-3] )   # "H"
print ( language[-4] )   # "T"
print ( language[-5] )   # "Y"
print ( language[-6] )   # "P"
print ( language[-7] )   # IndexError

N
O
H
T
Y
P


IndexError: string index out of range

In [8]:
language = "PYTHON"

print ( language[-20] )   # IndexError -- you have "overshot"

IndexError: string index out of range
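A negative index -k refers to the same character as the positive index len(s) - k. The short check below illustrates the relationship for every valid k.

```python
language = "PYTHON"

# Every valid negative index has an equivalent positive index.
for k in range(1, len(language) + 1):
    assert language[-k] == language[len(language) - k]

print(language[-1], language[len(language) - 1])   # N N
print(language[-0] == language[0])                 # True: -0 is just 0
```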

### Exercise

Use negative indexing to print the following letters from the string "Quacquarelli Symonds (QS) University Ranking":

* the letter q in the word "Quacquarelli"
* the letter S in the word "Symonds"
* the letter v in the word "University"
* the letter i in the word "Ranking"

In [10]:
# Solution to Exercise

ranking = "Quacquarelli Symonds (QS) University Ranking"

print (ranking[-40])   # the string has 44 characters, so index 4 is the same as index -40
print (ranking[-31])
print (ranking[-15])
print (ranking[-3])

q
S
v
i


# INDEXING vs SLICING

## String Slicing

String slicing is a powerful way to refer to a *part of a string* instead of just a *single character*:

* s[start:end] returns a substring from s beginning at the start index, running up to but *not including the end index*
* If the start index is omitted, the slice starts from the beginning of the string
* If the end index is omitted, the slice runs through to the end of the string
* If the start index is equal to the end index, the slice is the empty string.

Note: Make sure you are able to distinguish between string indexing and string slicing.

* string indexing: a single character is being accessed
* string slicing: a contiguous part of the string is being accessed

There is another key point you have to note.

* string indexing: you cannot go "beyond" the length of the string
* string slicing: you can go "beyond" the length of the string

In [12]:
school = "Nanyang Technological University (NTU)"

print ( "There are", len(school), "characters in the string." )
print ()
print ( school[8:21] )   # the character at index 21 (the end index) is not included
print ( school[22:32] )
print ( school[33:] )
print ( "Nanyang Technological University (NTU)"[:7] )
print ( school[50:] )   # prints empty line
print ( school[22:10000] )   # no problems going beyond the end of the string -- prints till the end of the string
print ()   # prints empty line
print ( school[-30:-17] )   # string slicing using negative indexing

There are 38 characters in the string.

Technological
University
(NTU)
Nanyang

University (NTU)

Technological
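The slicing rules can be checked directly. This minimal sketch reuses the same string and shows that out-of-range bounds are quietly clamped to the ends of the string.

```python
school = "Nanyang Technological University (NTU)"

print(school[22:10000] == school[22:len(school)])   # True: the end index is clamped
print(repr(school[50:]))    # '' because the start lies beyond the string
print(repr(school[8:8]))    # '' because start equals end
print(len(school[8:21]))    # 13, i.e. 21 - 8 characters
```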


In [14]:
print ( school[:] )
print (school)

Nanyang Technological University (NTU)
Nanyang Technological University (NTU)


### Note that:

```s[:i] + s[i:] == s```

In [15]:
school = "Massachusetts Institute of Technology (MIT)"

print ( school[:16] )
print ( school[16:] )
print ()
print ( school[:16] + school[16:] )   # remember: s[:i] + s[i:] == s

Massachusetts In
stitute of Technology (MIT)

Massachusetts Institute of Technology (MIT)


In [16]:
school = "Massachusetts Institute of Technology (MIT)"

print ( school[:20] )
print ( school[20:] )
print ()
print ( school[:20] + school[20:] )   # remember: s[:i] + s[i:] == s

Massachusetts Instit
ute of Technology (MIT)

Massachusetts Institute of Technology (MIT)


In [17]:
school = "Massachusetts Institute of Technology (MIT)"

print ( len(school) )
print ()
print ( school[:50] )
print ( school[50:] )
print ()
print ( school[:50] + school[50:] )   # 50 is greater than the length of school (43) -- no problems

43

Massachusetts Institute of Technology (MIT)


Massachusetts Institute of Technology (MIT)
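The identity holds for any integer split point, including negative values and values beyond the length of the string. A brute-force check over an arbitrary range of split points:

```python
school = "Massachusetts Institute of Technology (MIT)"

# Try split points from well below zero to well beyond the end of the string.
all_ok = all(school[:i] + school[i:] == school for i in range(-60, 61))
print(all_ok)   # True
```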


## Extracting Every xth Character

In [22]:
# Extracting every second character

quotation = "Thinking is the hardest work there is, which is probably the reason so few engage in it."

print ( quotation[::2] )  # start and end are omitted (the whole string); the number after the second colon is the step, so 2 takes every second character
print ( quotation[::1] )  # a step of 1 takes every character, so the string is returned unchanged
print ( quotation[::0] )  # ValueError: the step cannot be zero

Tikn stehretwr hr s hc spoal h esns e naei t
Thinking is the hardest work there is, which is probably the reason so few engage in it.


ValueError: slice step cannot be zero

In [23]:
# Extracting every third character, from the sixth character onwards -- the sixth character = quotation[5]

quotation = "Thinking is the hardest work there is, which is probably the reason so few engage in it."

print ( quotation[5::3] )

i  eae rtri i  ob eeosf gent
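In general, s[start:end:step] picks the characters at the same indices that range(start, end, step) would produce. The sketch below compares the slice with an explicit loop; by_slice and by_loop are illustrative names.

```python
quotation = "Thinking is the hardest work there is, which is probably the reason so few engage in it."

by_slice = quotation[5::3]
by_loop = "".join(quotation[i] for i in range(5, len(quotation), 3))
print(by_slice == by_loop)     # True

print("abcdefghij"[1:8:2])     # bdfh  (indices 1, 3, 5, 7)
```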


## Reversing a String

A string can be reversed by using ```s[::-1]```.
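A negative step walks through the string backwards, which is why a step of -1 produces the reversed string. The sketch below uses a short made-up string to show a few consequences.

```python
word = "PYTHON"

backwards = word[::-1]
print(backwards)                  # NOHTYP
print(backwards[::-1] == word)    # True: reversing twice restores the original
print(word[::-2])                 # NHY  (every second character, starting from the end)
print(word[::-1][::2] == word[::-2])   # True: a slice is itself a string, so it can be sliced again
```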

In [20]:
quotation = "Thinking is the hardest work there is, which is probably the reason so few engage in it."

print ( quotation[::-1] )

.ti ni egagne wef os nosaer eht ylbaborp si hcihw ,si ereht krow tsedrah eht si gniknihT


In [21]:
quotation = "Thinking is the hardest work there is, which is probably the reason so few engage in it."

print ( quotation[::-1][::2] )

.in ggewfo oaretybbr ihiw,ieetko sda h iginh


In [24]:
print ( ".ti ni egagne wef os nosaer eht ylbaborp si hcihw ,si ereht krow tsedrah eht si gniknihT"[::2] )

.in ggewfo oaretybbr ihiw,ieetko sda h iginh


#### How is this possible?

In [25]:
print ( type(quotation[::-1]) )

<class 'str'>


## String Concatenation

The + operator combines (or "concatenates") two or more strings to make a bigger string.

This creates new strings to represent the result, leaving the original strings unchanged.

Think of concatenation as gluing two (or more) strings together.

In [26]:
print (3 + 5)   # with numbers, the operator "+" adds them together

8


In [27]:
s1 = "Multi-Ministry"   # a string literal
s2 = "Taskforce"   # a string literal
s3 = "(MTF)"   # a string literal

s4 = s1 + s2 + s3   # with strings, the operator "+" concatenates the strings together
print ( s4 )

print ( "Multi-Ministry" + "Taskforce" + "(MTF)" ) 


s5 = s1 + " " + s2 + " " + s3
print ( s5 )

Multi-MinistryTaskforce(MTF)
Multi-MinistryTaskforce(MTF)
Multi-Ministry Taskforce (MTF)
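When several strings need the same separator, the str.join() method is a common alternative to repeated use of +. A minimal sketch using the same three strings:

```python
s1 = "Multi-Ministry"
s2 = "Taskforce"
s3 = "(MTF)"

joined = " ".join([s1, s2, s3])   # the separator goes between the items
print(joined)                     # Multi-Ministry Taskforce (MTF)
print(s1)                         # Multi-Ministry  (the originals are unchanged)
```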


### Types Cannot be Mixed

Concatenation (+) only works with two or more strings.

Concatenating a string with any other data type (int, float or complex) will result in an error.

The str() function is used to convert integers or floating-point numbers to a string.  This is called "type casting".

In [28]:
s6 = 2 + "rabbits"   # TypeError -- 2 is an integer, and "rabbits" is a string -- an incompatible mixture of types!

TypeError: unsupported operand type(s) for +: 'int' and 'str'

In [29]:
s7 = str(2) + " rabbits"   # the integer 2 has been type cast into the string "2"

print (s7)

2 rabbits


In [30]:
"Final Score: " + 6   # TypeError -- strings can concatenate only with other strings!!!

TypeError: can only concatenate str (not "int") to str

In [31]:
"Final Score: " + str(6)   # type cast the integer 6 to the string "6", then concatenate the two strings

'Final Score: 6'
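
The same idea applies when the numbers are stored in variables. The short sketch below uses made-up values (the names `score` and `average` are assumptions for illustration) to build one message from an integer and a floating-point number.

```python
# Hypothetical values, chosen for illustration
score = 6
average = 4.5

# Each number is converted with str() before it is concatenated
message = "Final Score: " + str(score) + ", Average: " + str(average)

print(message)   # Final Score: 6, Average: 4.5
```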

### A Relatively Unknown Fact

In Python, putting string literals next to each other automatically concatenates them!

This only works with string literals.  It does not work with variables that contain strings.

In [37]:
odd_string = "Urdu" "Tamil" "Hindi"   # note that there are no commas between the strings!
odd_string1 = "Urdu", "Tamil", "Hindi"   # with commas, the result is a tuple of three strings, not one string
odd_string2 = "Urdu"+"Tamil"+"Hindi"  # explicit concatenation; between string literals, the + can be omitted
print (odd_string)
print (odd_string1)
print (odd_string2)

UrduTamilHindi
('Urdu', 'Tamil', 'Hindi')
UrduTamilHindi


In [34]:
language = "Tamil"

odd_string = "Urdu" language "Hindi"

# this does not work because language is not a string literal
# language is a variable!
# a SyntaxError results

print (odd_string)

SyntaxError: invalid syntax (869232627.py, line 3)

In [38]:
language = "Tamil"

odd_string = "Urdu" + language + "Hindi"     # this is okay!
# if any of the pieces is a variable rather than a string literal, the plus sign is required

print (odd_string)

UrduTamilHindi
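
A common use of this feature is breaking a long string literal across several lines inside parentheses. The sketch below is one way to do it; the text of the message is invented for illustration.

```python
# Adjacent literals inside the parentheses are joined into one string.
# Note the trailing space inside each piece: nothing is added automatically.
long_message = (
    "Adjacent string literals are joined automatically, "
    "which makes it easy to break a long string "
    "across several lines."
)

print(long_message)
```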


## String Repetition

The multiplication operator (\*) repeats a string a given number of times.

In [39]:
print (3 * 5)   # with numbers, the operator "*" multiplies the numbers together

15


In [40]:
print ("Ho! " * 3)   # with strings, the operator "*" repeats the string

Ho! Ho! Ho! 
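
Repetition is handy for drawing simple text decorations. As a small sketch (the variable names are assumptions for illustration), a heading can be underlined with exactly as many "=" characters as it has letters.

```python
title = "Python Strings"

# Repeat "=" once for every character in the title
underline = "=" * len(title)

print(title)
print(underline)   # ==============
```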


In [41]:
print ("Ho! " * 4.)   # TypeError -- cannot multiply a string by a floating-point number

TypeError: can't multiply sequence by non-int of type 'float'

In [42]:
print ("Ho! " - "!")   # TypeError -- cannot use the minus operator on strings to remove the "!" from "Ho!"

TypeError: unsupported operand type(s) for -: 'str' and 'str'

In [44]:
# combining string concatenation and string repetition

print ("Santa Claus said: " + "Ho! " * 3)  # "*" is applied before "+", so the string is repeated first and then concatenated

Santa Claus said: Ho! Ho! Ho! 


## The in and not in Operators (Membership Test)

The in operator checks whether one string appears anywhere inside another string.

It returns a Boolean value (either True or False).

Note 1: "in" is known as a *membership operator*

Note 2: The "partner" operator of "in" is "not in", which returns True when the substring is *not* found

In [45]:
"p" in "Singapore"   # there is a "p" in "Singapore"

True

In [46]:
"P" in "Singapore"   # there is no "P" in "Singapore"   # little p ("p") is not the same as capital p ("P")

False

In [47]:
"Sing" in "Singapore"   # there is a "Sing" in "Singapore"

True

In [48]:
"sing" in "Singapore"   # there is no "sing" in "Singapore"

False

In [49]:
"pine" in "Philippines"   # there is a "pine" in "Philippines"

True

In [50]:
"oak" in "Philippines"   # there is no "oak" in "Philippines"

False

An empty string can always be found in a string.

In [51]:
"" in "Singapore"   # "" is an empty string -- an empty string is a member of any string!

True

In [52]:
"" in ""   # "" is an empty string -- an empty string is a member of any string, including empty strings!

True

In [53]:
"COBOL" not in "Java, C, C++, C#, Pascal, Fortran, BASIC"   # True, "COBOL" is not in the string

True
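
Because the membership test is case-sensitive, a common workaround is to convert the text to lowercase before testing. The sketch below uses the ```lower()``` method, which is described in the "Changing Case" section; the variable names are chosen for illustration.

```python
city = "Singapore"

case_sensitive = "sing" in city            # False: "S" is not the same as "s"
case_insensitive = "sing" in city.lower()  # True: compares against "singapore"

print(case_sensitive)
print(case_insensitive)
```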

## Character Class Tests

The characters that make up a string can be divided into several categories or "character classes".

Many times, you will need to test if a character belongs to a class (or category of characters).

There are several classes:

* alphabetic characters - abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ

* lowercase alphabetic characters - abcdefghijklmnopqrstuvwxyz

* uppercase alphabetic characters - ABCDEFGHIJKLMNOPQRSTUVWXYZ

* digit characters - 0123456789

* space characters - space ' ', newline '\n', tab '\t'

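The test methods return ```True``` if the string contains at least one character and *all* of its characters are in the given class, and ```False``` if not.  (```islower()``` and ```isupper()``` only examine the letters in the string, and ignore digits and punctuation.)

If even a single character does not belong to the class, the method returns ```False```.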

In [27]:
print ('a'.isalpha())             # does it belong to the list of alphabetic characters?
# isalpha() accepts both lowercase and uppercase letters
print ('A'.isalpha())

True
True


In [2]:
'abc'.isalpha()           # do all three characters belong to the list of alphabetic characters?

True

In [3]:
'$abc'.isalpha()          # "$" does not belong to alphabetic characters

False

In [4]:
string = '$abc'

print (string.isalpha())

False


In [5]:
'abc'.islower()           # do all three characters belong to the list of lowercase characters?

True

In [6]:
'Abc'.islower()           # "A" is not a lowercase character

False

In [7]:
'ABC'.isupper()           # do all three characters belong to the list of uppercase characters?

True

In [28]:
"\u03A3".isalpha()        # 03A3 refers to the (Greek) capital sigma sign
# letters from other alphabets are alphabetic too; "\u03A3" is a Unicode escape for the single character Σ

True

In [9]:
"6".isdigit()

True

In [10]:
"612".isdigit()

True

In [29]:
"6.12".isdigit()          # isdigit() is super strict and does not even allow the decimal point
# the decimal point "." is not a digit, so the result is False

False

In [34]:
"6".isdigit()

True

In [35]:
"12.23".isdecimal()

False

In [37]:
"12.23".isnumeric()

False

In [31]:
"12345".isdecimal()

True

In [32]:
"12345".isnumeric()

True

In [33]:
"a12".islower()   # True -- digits are ignored; only the letters need to be lowercase

True

In [16]:
"AIDS".isupper()

True

In [38]:
"National University of Singapore".istitle()   # False because of the lowercase letter "o" in the word "of"
# every word must start with a capital letter ("Of") for istitle() to return True

False

In [18]:
"National University Of Singapore".istitle()

True

In [19]:
" ".isspace()   # obviously a space

True

In [39]:
"          \t\n\n\n\n\t".isspace()   # newline and tab characters are also whitespace
# "\t" is the escape sequence for a tab and "\n" for a newline; both count as whitespace

True

In [41]:
"".isspace()   # an empty string does not contain any characters (space is a character)
# with no characters at all, there is nothing to test, so the result is False

False

In [43]:
"abc".isalnum()   # True if every character is alphabetic or numeric; letters alone are enough

True

In [22]:
"123abc".isalnum()   # alphabetic or numeric

True

In [44]:
"123 abc".isalnum()   # False -- the space character is neither alphabetic nor numeric

False

In [24]:
"123#abc".isalnum()   # the "#" character is neither alphabetic nor numeric

False
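
For ordinary digits such as "5", the methods ```isdecimal()```, ```isdigit()``` and ```isnumeric()``` all agree. They differ on less common Unicode characters. The sketch below compares them on two such characters (the superscript two and the fraction one half); the variable names are assumptions for illustration.

```python
five = "5"
superscript_two = "\u00b2"   # the character ²
one_half = "\u00bd"          # the character ½

# Each tuple holds (isdecimal, isdigit, isnumeric) for one character
five_result = (five.isdecimal(), five.isdigit(), five.isnumeric())
two_result = (superscript_two.isdecimal(), superscript_two.isdigit(), superscript_two.isnumeric())
half_result = (one_half.isdecimal(), one_half.isdigit(), one_half.isnumeric())

print(five_result)   # (True, True, True)
print(two_result)    # (False, True, True)
print(half_result)   # (False, False, True)
```

In short, ```isdecimal()``` is the strictest of the three and ```isnumeric()``` the most permissive.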

## Summary

In this section, we covered the following string methods.

* ```isalpha()```
* ```islower()```
* ```isupper()```
* ```isdecimal()```
* ```isdigit()```
* ```isnumeric()```
* ```istitle()```
* ```isspace()```
* ```isalnum()```

They test whether the characters that make up a string belong to a specific class of characters.

In [45]:
# Remember this from the first session?
# These are words that cannot be used as variable names -- they are known as Python keywords

import keyword

print ( keyword.kwlist )

['False', 'None', 'True', 'and', 'as', 'assert', 'async', 'await', 'break', 'class', 'continue', 'def', 'del', 'elif', 'else', 'except', 'finally', 'for', 'from', 'global', 'if', 'import', 'in', 'is', 'lambda', 'nonlocal', 'not', 'or', 'pass', 'raise', 'return', 'try', 'while', 'with', 'yield']


In [48]:
# A test can be performed if a string is one of the Python keywords.
# Remember that Python keywords cannot be used as variable names.

import keyword

s = input("Enter a string: ")

print ( s )
print ( type(s) )

iskey = keyword.iskeyword(s) # call the iskeyword() function from the keyword module on s; it returns a Boolean value
print (s,'is a keyword:', iskey)

Enter a string: False
False
<class 'str'>
False is a keyword: True
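
The keyword test can be combined with the string method ```isidentifier()``` to check whether a string could be used as a variable name. This is only a sketch: the helper name `is_valid_name` and the sample strings are assumptions made for illustration.

```python
import keyword

def is_valid_name(name):
    # Hypothetical helper: a usable variable name must look like an
    # identifier and must not be one of the Python keywords
    return name.isidentifier() and not keyword.iskeyword(name)

print(is_valid_name("total_2"))   # True
print(is_valid_name("2total"))    # False -- starts with a digit
print(is_valid_name("class"))     # False -- a Python keyword
```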


## Changing Case

The ```lower()``` method returns a new version of the string where each character is converted to its lowercase form, so 'A' becomes 'a', 'B' becomes 'b', etc.  Characters like '$' remain unchanged.  The original string is unchanged, so store the new string in a variable if you want to keep it.

Conversely, the ```upper()``` method returns an uppercase version of the string.

In [49]:
organisation = "Association of Southeast Asian Nations (ASEAN)"

print ( organisation.lower() )   # the method is called on a variable that holds a string
print ("Association of Southeast Asian Nations (ASEAN)".lower() )


print ()
print ( organisation.upper() )
print ( organisation.lower()+"; "+organisation.upper() )

association of southeast asian nations (asean)
association of southeast asian nations (asean)

ASSOCIATION OF SOUTHEAST ASIAN NATIONS (ASEAN)
association of southeast asian nations (asean); ASSOCIATION OF SOUTHEAST ASIAN NATIONS (ASEAN)


If you need to perform further operations, then the new string has to be stored in a variable.

In [50]:
organisation = "Association of Southeast Asian Nations (ASEAN)"

s1 = organisation.lower()
s2 = organisation.upper()

print ( s1 )
print ( s2 )

big_s = s1 + s2      # string concatenation

print ( big_s )

# after this, you can use the variables s1 and s2 to perform other things ...

association of southeast asian nations (asean)
ASSOCIATION OF SOUTHEAST ASIAN NATIONS (ASEAN)
association of southeast asian nations (asean)ASSOCIATION OF SOUTHEAST ASIAN NATIONS (ASEAN)


In [51]:
sentence = "I'M FEELING A LITTLE HUNGRY RIGHT NOW."

print ( sentence.capitalize() )   # capitalize() makes the first character uppercase and all the others lowercase

I'm feeling a little hungry right now.


In [54]:
university_name = "nanyang technological university"

print ( university_name.title() )  # title() capitalizes the first letter of each word

Nanyang Technological University


In [53]:
sentence = "This bag costs S$123.45"

print ( sentence.upper() )

THIS BAG COSTS S$123.45


Since the ```upper()```, ```lower()```, ```title()``` and ```capitalize()``` string methods return strings, you can call string methods on those returned string values as well.  Expressions that do this look like they have been *chained* together.

In [59]:
print ("wkwsci".upper())
print ("wkwsci".upper().lower())
print ("wkwsci".upper().lower().upper())
print ("wkwsci".upper().lower().upper().title())  # each method is called on the string returned by the previous one

WKWSCI
wkwsci
WKWSCI
Wkwsci


#### How is this possible?

In [60]:
print ( type("wkwsci".upper()) )
print ( type("wkwsci".upper().lower()) )
print ( type("wkwsci".upper().lower().upper()) )
print ( type("wkwsci".upper().lower().upper().title()) )

<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>


#### Rarer String Methods: ```.swapcase()``` and ```.casefold()```

In [67]:
programming_language = "Python"

print ( programming_language.swapcase() )  # storing the string in a variable is optional, but makes the code easier to read

print ("asdPPaf".swapcase())
# swapcase() is a string method, so it is written after the string, separated by a dot

pYTHON
ASDppAF


In [70]:
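alphabet = 'aßcde'
print ( alphabet.casefold() )   # casefold() is more aggressive than lower(): the German 'ß' becomes 'ss'

print ()

alphabet = 'außen'
print ( alphabet.lower() )      # lower() leaves 'ß' unchanged

print ()
swapcase("sdf")   # NameError -- swapcase() is a string method, not a built-in function; write "sdf".swapcase() instead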

asscde

außen



NameError: name 'swapcase' is not defined
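
The main use of ```casefold()``` is comparing strings while ignoring case. The sketch below contrasts it with ```lower()``` on a German word; the sample strings are chosen for illustration.

```python
a = "Straße"
b = "STRASSE"

# lower() keeps the 'ß', so the two strings still differ
same_with_lower = (a.lower() == b.lower())

# casefold() turns 'ß' into 'ss', so the two strings match
same_with_casefold = (a.casefold() == b.casefold())

print(same_with_lower)      # False
print(same_with_casefold)   # True
```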

## Summary

The methods in this section return a new version of the string with specific characters changed.

* ```lower()```
* ```upper()```
* ```title()```
* ```capitalize()```

## The ```.startswith()``` and ```.endswith()``` String Methods

These are convenient methods that return a Boolean value (either True or False) depending on what appears at one end of a string.

One example is when you need to check for a particular file type, e.g., a file name with the extension '.jpeg'.

* ```s.startswith(x)``` -- True if string s starts with string x
* ```s.endswith(x)``` -- True if string s ends with string x

Both methods accept optional start and end index arguments, which restrict the test to the slice ```s[start:end]```.

In [88]:
print('GraduationFamilyPortrait.jpeg'.startswith('Graduation'))
# startswith() returns a Boolean value

print("Graduation FamilyPortrait.jpeg".startswith("Family"))

True
False


In [72]:
'GraduationFamilyPortrait.jpeg'.startswith('Family', 10)
# the 10 means the test starts at index 10, where 'Family' begins

True

In [73]:
'GraduationFamilyPortrait.jpeg'.startswith('Portrait', 16)

True

In [74]:
'GraduationFamilyPortrait.jpeg'.startswith('Wedding')

False

In [75]:
speech.startswith('Wedding')   # NameError -- the variable speech has not been defined, so there is no string to test

NameError: name 'speech' is not defined

### Testing for a Specific Filetype Using endswith

The .endswith() string method can be used to test for file types.

In [76]:
'GraduationFamilyPortrait.jpeg'.endswith('.jpeg')   # testing for jpeg files

True

In [77]:
'GraduationFamilyPortrait.jpeg'.endswith('.pdf')   # testing for pdf files

False

In [78]:
'GraduationFamilyPortrait.JPEG'.endswith('.jpeg')   # case-sensitive -- "JPEG" is not the same as "jpeg"
# the match must be exact; strings that differ only in case are treated as different

False

In [89]:
'GraduationFamilyPortrait.JPEG'.lower()

'graduationfamilyportrait.jpeg'

In [90]:
'GraduationFamilyPortrait.JPEG'.lower().endswith('.jpeg')   # the solution to the case-sensitivity problem
# convert the name to lowercase first, then test for ".jpeg"
# (or convert it to uppercase with upper(), then test for ".JPEG")

True

In [91]:
'GraduationFamilyPortrait.JPEG'.lower().endswith('peg')  # the suffix can be of any length, not just a full file extension

True

In [80]:
'GraduationFamilyPortrait.jpeg'.endswith('.jpeg', 20) # 20 is the index from which the test starts

True

In [81]:
'GraduationFamilyPortrait.jpeg'.endswith('Portrait', 10, 24) # tests the slice from index 10 up to (but not including) index 24

True
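
Both methods also accept a tuple of strings, in which case they return True if *any* of the strings matches. The sketch below tests for several image file types at once; the list of extensions is an assumption made for illustration.

```python
filename = "GraduationFamilyPortrait.JPEG"

# Hypothetical set of extensions; this must be a tuple, not a list
image_extensions = (".jpg", ".jpeg", ".png")

# Lowercase the name first so that ".JPEG" and ".jpeg" are treated alike
is_image = filename.lower().endswith(image_extensions)

print(is_image)   # True
```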

## The ```.removeprefix()``` and ```.removesuffix()``` String Methods

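These string methods remove a given prefix or suffix from a string, respectively.  Each removes at most one occurrence, and returns the string unchanged if it does not start (or end) with the given text.  They are available in Python 3.9 and later.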

In [93]:
"abcabcabcabcabcWee Kim Wee School of Communication and Informationabcabcabcabcabc".removeprefix("abcabcabc")
# removes "abcabcabc" once from the front; the "abcabc" that follows it is left in place

'abcabcWee Kim Wee School of Communication and Informationabcabcabcabcabc'

In [83]:
"abcabcabcabcabcWee Kim Wee School of Communication and Informationabcabcabcabcabc".removesuffix("abc")
# removes a single "abc" from the back; the other copies are left in place

'abcabcabcabcabcWee Kim Wee School of Communication and Informationabcabcabcabc'
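
A typical use is removing a file extension. The sketch below (with made-up file names, and assuming Python 3.9 or later) also shows that nothing happens when the suffix does not match, and that only one copy of a prefix is removed.

```python
filename = "report.final.txt"

without_txt = filename.removesuffix(".txt")   # the suffix matches, so it is removed
without_pdf = filename.removesuffix(".pdf")   # no match, so the string is unchanged
once_only = "abcabc".removeprefix("abc")      # only the first "abc" is removed

print(without_txt)   # report.final
print(without_pdf)   # report.final.txt
print(once_only)     # abc
```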

## The String ```.find()``` Method

The string method ```find(x)``` searches a string from left to right, returning the int index where string x appears, or -1 if not found.

Use ```find()``` to compute the index where a substring first appears.

In [94]:
entity = "Association of Southeast Asian Nations (ASEAN)"

print( entity.find('of') )
print( entity.find('Asian') )
print( entity.find('NATIONS') )       # string 'NATIONS' cannot be found (a value of -1 will be returned)
print( entity.find('butbut') ) 

# find() returns the index at which the first match starts
# the search is case-sensitive; -1 is returned if the substring is not found

12
25
-1
-1


Use ```rfind()``` to compute the index where a substring *last* appears, i.e. the first match when searching from the end of the string.

In [95]:
entity = "Association of Southeast Asian and Northeast Asian Nations (ASEAN)"   # the word "Asian" appears twice!

print( entity.find('Asian') )         # find --> search from front
print( entity.rfind('Asian') )        # rfind --> search from the back, but the index reported is still counted from the front
print( entity.rfind('NATIONS') )      # string 'NATIONS' cannot be found

25
45
-1
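
The index returned by ```find()``` is often used together with slicing to pull a piece out of a string. The sketch below extracts the text between the parentheses; the variable names are assumptions for illustration, and a real program should check for -1 before slicing, as done here.

```python
entity = "Association of Southeast Asian Nations (ASEAN)"

start = entity.find("(")   # index of the opening parenthesis
end = entity.find(")")     # index of the closing parenthesis

# Only slice if both parentheses were actually found
if start != -1 and end != -1:
    acronym = entity[start + 1:end]
    print(acronym)   # ASEAN
```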


## Strip Whitespace

The string methods ```lstrip()```, ```rstrip()``` and ```strip()``` return a version of the string with the whitespace characters (space, tab, newline) removed from the very start, the very end, or both ends of the string, respectively.

It can be used to clean up strings parsed out of a file or read from a user in a dialog box.

In [96]:
name = "   Wee Kim Wee School of        Communication and Information (WKWSCI)                    "

print ("Before:")
print ( "*", name, "*" )    # by default, print() separates its arguments with a single space
print ( len(name) )
print ()
print ( "After:" )
clean_name = name.lstrip()  # the "l" in lstrip stands for left
                            # lstrip() removes the whitespace at the front of the string
print ( "*", clean_name, "*" )   # notice that the whitespace in the middle of the string is left untouched
print ( len(clean_name) )

Before:
*    Wee Kim Wee School of        Communication and Information (WKWSCI)                     *
90

After:
* Wee Kim Wee School of        Communication and Information (WKWSCI)                     *
87


In [97]:
name = "   Wee Kim Wee School of        Communication and Information (WKWSCI)                    "

print ("Before:")
print ( "*", name, "*", sep = "")     # sep = "" sets the separator to an empty string, so print() adds no extra spaces
print ( len(name) )
print ()
print ( "After:" )
clean_name = name.lstrip()
print ( "*", clean_name, "*", sep = "")   # notice that the whitespace in the middle of the string is left untouched
print ( len(clean_name) )

Before:
*   Wee Kim Wee School of        Communication and Information (WKWSCI)                    *
90

After:
*Wee Kim Wee School of        Communication and Information (WKWSCI)                    *
87


In [101]:
name = "   Wee Kim Wee School of        Communication and Information (WKWSCI)                    "

print ("Before:")
print ( "*", name, "*", sep = "" )
print ( len(name) )
print ()
print ( "After:" )
clean_name = name.rstrip()                        # rstrip() removes whitespace from the right-hand end only
print ( "*", clean_name, "*", sep = "" )                # notice that the whitespace in the middle of the string is left untouched
print ( len(clean_name) )

Before:
*   Wee Kim Wee School of        Communication and Information (WKWSCI)                    *
90

After:
*   Wee Kim Wee School of        Communication and Information (WKWSCI)*
70


In [102]:
name = "   Wee Kim Wee School of        Communication and Information (WKWSCI)                    "

print ("Before:")
print ( "*", name, "*", sep = "" )
print ( len(name) )
print ()
print ( "After:" )
clean_name = name.strip()                          # strip() removes whitespace from both ends
print ( "*", clean_name, "*", sep = "" )                # notice that the whitespace in the middle of the string is left untouched
print ( len(clean_name) )

Before:
*   Wee Kim Wee School of        Communication and Information (WKWSCI)                    *
90

After:
*Wee Kim Wee School of        Communication and Information (WKWSCI)*
67


#### Non-space Strips

In [100]:
name = "abcbcacbaWee Kim Wee School of Communication and Informationabcbcacba"

print ("Before:")
print ( "*", name, "*", sep = "" )
print ( len(name) )
print ()
print ( "After:" )
clean_name = name.strip()                  # no argument: only whitespace is stripped, so nothing changes here
print ( "*", clean_name, "*", sep = "" )
print ( len(clean_name) )
print ()
print ( "After:" )
clean_name = name.lstrip("abc")            # removes every leading 'a', 'b' or 'c', in any order (removeprefix() removes one exact prefix)
print ( "*", clean_name, "*", sep = "" )
print ( len(clean_name) )
print ()
print ( "After:" )
clean_name = name.rstrip("abc")            # removes every trailing 'a', 'b' or 'c'
print ( "*", clean_name, "*", sep = "" )
print ( len(clean_name) )
print ()
print ( "After:" )
clean_name = name.strip("abc")             # removes them from both ends
print ( "*", clean_name, "*", sep = "" )   # the characters in the middle of the string are left untouched
print ( len(clean_name) )

Before:
*abcbcacbaWee Kim Wee School of Communication and Informationabcbcacba*
69

After:
*abcbcacbaWee Kim Wee School of Communication and Informationabcbcacba*
69

After:
*Wee Kim Wee School of Communication and Informationabcbcacba*
60

After:
*abcbcacbaWee Kim Wee School of Communication and Information*
60

After:
*Wee Kim Wee School of Communication and Information*
51
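
The argument given to ```lstrip()```, ```rstrip()``` or ```strip()``` is treated as a *set of characters*, not as a prefix or suffix. The sketch below (with a made-up file name, and assuming Python 3.9 or later for ```removeprefix()```) shows how this can remove more than intended.

```python
filename = "photo_photos.png"

# removeprefix() removes the exact text "photo_" once
exact = filename.removeprefix("photo_")

# lstrip() keeps removing leading characters found in "photo_"
# (p, h, o, t and _) until it meets one that is not in the set
greedy = filename.lstrip("photo_")

print(exact)    # photos.png
print(greedy)   # s.png
```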


## The String ```.replace()``` Method

The string method ```replace()``` returns a version of the string where *all* occurrences of old have been replaced by new.

The form of the method is:

```string.replace(old, new)```

It does not pay attention to word boundaries, but just replaces every instance of old in the string with new.

Replacing with the empty string effectively deletes the matching strings.

In [103]:
movie_title = "Man of Wrath"

print ( movie_title.replace('of','in') )

print ( movie_title )       # no change in original string

Man in Wrath
Man of Wrath


In [104]:
movie_title = "Man of Wrath"

new_movie_title = movie_title.replace('of','in')
print ( new_movie_title )

# you can do something with the new_movie_title here ...

Man in Wrath


In [105]:
movie_title = "Man of Wrath of Singapore"

print ( movie_title.replace('of','in') )            # every occurrence of 'of' is replaced, not just the first

print ( movie_title )

Man in Wrath in Singapore
Man of Wrath of Singapore
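
Three further points about ```replace()``` are sketched below: an optional third argument limits the number of replacements, replacing with an empty string deletes the matches, and matches inside longer words are replaced too. The sample strings are chosen for illustration.

```python
movie_title = "Man of Wrath of Singapore"

first_only = movie_title.replace("of", "in", 1)   # replace at most 1 occurrence
no_spaces = movie_title.replace(" ", "")          # replacing with "" deletes the spaces
inside_word = "sofa of gold".replace("of", "in")  # the "of" inside "sofa" is replaced as well

print(first_only)    # Man in Wrath of Singapore
print(no_spaces)     # ManofWrathofSingapore
print(inside_word)   # sina in gold
```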


## Escape Characters

A backslash (```\```) in a string literal starts an *escape sequence*, which lets you include a special character in the string, such as a quote or a newline (```\n```). Common backslash escapes:

|Character|Meaning|
|-|-|
|```\'```|single quote|
|```\"```|double quote|
|```\\```  |backslash|
|```\n```|newline|
|```\t```|tab|
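
Although an escape sequence is typed as two characters, it is stored in the string as a single character. The sketch below checks this with ```len()```; the variable names and the sample path are assumptions for illustration.

```python
newline = "\n"
path = "C:\\temp"   # "\\" stands for one backslash

print(len(newline))   # 1
print(path)           # C:\temp
print(len(path))      # 7
```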

In [106]:
multiline_string = "\nThe First Line\nThe Second Line\nThe Third Line\n"

print ( multiline_string )


The First Line
The Second Line
The Third Line



In [111]:
multiline_string = """The First Line
The Second Line
The Third Line"""

# triple quotes are required here; a string in single or double quotes cannot span several lines

print ( multiline_string )

The First Line
The Second Line
The Third Line


In [112]:
string_with_tabs = "\nThe First Column\t\tThe Second Column\t\tThe Third Column\n\n\n\n\n"   # \t refers to a tab

print ( string_with_tabs )       # two tabs


The First Column		The Second Column		The Third Column







In [121]:
s1 = "This isn't me."
s2 = "He said, \"This isn't me.\""      # \" escapes the double quote so that it does not end the string
s3 = 'He said, "This isn\'t me."'       # inside single quotes, only the apostrophe needs to be escaped
s4 = "John o' Groats"                 # a village near the northern tip of mainland Britain
s5 = '''He said, "This isn't me."'''    # inside triple quotes, neither kind of quote needs escaping
# s6 = 'He said, "This isn't me.'"'     # SyntaxError -- the unescaped apostrophe ends the string early

print ( s1 )
print ( s2 )
print ( s3 )
print ( s4 )
print ( s5 )

This isn't me.
He said, "This isn't me."
He said, "This isn't me."
John o' Groats
He said, "This isn't me."

## Multiline String

To create a multiline string, you can also use triple quotes: either ''' or """ at both ends of the string.

Note that if you use triple quotes, you need *not* use the escape sequence for a newline ("\n"); simply press Enter inside the string.

In [110]:
nursery_rhyme_1 = '''Hey, diddle, diddle, the cat and the fiddle
The cow jumped over the moon
The little dog laughed to see such fun
And the dish ran away with the spoon'''

nursery_rhyme_2 = """Twinkle, twinkle, little star,
How I wonder what you are,
Up above the world so high,
Like a diamond in the sky,
Twinkle, twinkle, little star,
How I wonder what you are."""

print ( nursery_rhyme_1 )       # print the first nursery rhyme
print ()                        # print an empty line
print ( nursery_rhyme_2 )       # print the second nursery rhyme

Hey, diddle, diddle, the cat and the fiddle
The cow jumped over the moon
The little dog laughed to see such fun
And the dish ran away with the spoon

Twinkle, twinkle, little star,
How I wonder what you are,
Up above the world so high,
Like a diamond in the sky,
Twinkle, twinkle, little star,
How I wonder what you are.
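
A triple-quoted string and a string written with "\n" produce exactly the same value; the line breaks typed inside the triple quotes are stored as newline characters. The sketch below confirms this with a short rhyme chosen for illustration.

```python
rhyme_triple = """Row, row, row your boat
Gently down the stream"""

rhyme_escaped = "Row, row, row your boat\nGently down the stream"

# Both literals describe the same sequence of characters
print(rhyme_triple == rhyme_escaped)   # True
```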


## Splitting a String

```str.split(',')``` is a string method which divides a string into a list of substrings, based on a "separator" argument that marks where the pieces are divided.  If no separator is given, the string is split on whitespace.

When you use the split() method, think "one string split into several substrings" in a list.

In [130]:
title = "Automate the Boring Stuff with Python"

print ( title.split()[3])   # returns the fourth word -- the list index starts at 0, so [3] is the fourth item

Stuff


In [122]:
title = "Automate the Boring Stuff with Python"

print ( title.split() )   # results are substrings in a list

['Automate', 'the', 'Boring', 'Stuff', 'with', 'Python']


In [123]:
print ( "Automate the Boring Stuff with Python".split() ) # same applies to the full sentence entered

['Automate', 'the', 'Boring', 'Stuff', 'with', 'Python']


In [124]:
sentence = "Good morning, my name is Andrew, I am 25 years old, and I'm in a wonderful mood today."

print ( sentence.split(", ") )     # results are substrings in a list
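# the separator ", " is specified here, so the string is split at every comma followed by a space
# by default (with no argument), split() splits on whitespace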

['Good morning', 'my name is Andrew', 'I am 25 years old', "and I'm in a wonderful mood today."]


In [125]:
messy_list = "Kuala Lumpur#Ho Chi Minh City#Hanoi#Bangkok#Bandar Seri Begawan#Manila"

print ( messy_list.split("#") )    # results in a list

['Kuala Lumpur', 'Ho Chi Minh City', 'Hanoi', 'Bangkok', 'Bandar Seri Begawan', 'Manila']


In [131]:
cities = "London#Paris#Berlin#Moscow#Lisbon#Madrid#Vienna#Amsterdam"

# setting the maxsplit parameter to 3 will return a list with 4 elements!
print ( cities.split("#", 3) )     # results are substrings in a list: at most 3 splits, so 4 parts

# note that the 3 is not an index -- it is the maximum number of splits, so only the first 3 "#" characters are used as separators

['London', 'Paris', 'Berlin', 'Moscow#Lisbon#Madrid#Vienna#Amsterdam']


In [127]:
cities = "London#Paris#Berlin#Moscow#Lisbon#Madrid#Vienna#Amsterdam"

# setting the maxsplit parameter to 100 is okay because 100 specifies the *maximum* number of splits
print ( cities.split("#", 100) )     # results are substrings in a list

['London', 'Paris', 'Berlin', 'Moscow', 'Lisbon', 'Madrid', 'Vienna', 'Amsterdam']


In [134]:
title = "Automate the Boring Stuff with Python"

print ( title.split(" ", 3) )      # results are substrings in a list
print ( title.split())

['Automate', 'the', 'Boring', 'Stuff with Python']
['Automate', 'the', 'Boring', 'Stuff', 'with', 'Python']


In [129]:
messy_list = "Kuala Lumpur#Ho Chi Minh City#Hanoi#Bangkok#Bandar Seri Begawan#Manila"

print ( messy_list.split("Ho Chi Minh City") )    # results are substrings in a list
# any string can be the separator; here "Ho Chi Minh City" plays the same role as a comma or a space

['Kuala Lumpur#', '#Hanoi#Bangkok#Bandar Seri Begawan#Manila']
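
Calling ```split()``` with no argument is not quite the same as calling ```split(" ")```. The sketch below, using a made-up string with uneven spacing, shows that the default treats a run of spaces as one separator, while an explicit single space produces empty strings.

```python
spaced = "Automate   the Boring  Stuff"

by_whitespace = spaced.split()        # runs of whitespace count as one separator
by_single_space = spaced.split(" ")   # every single space is a separator

print(by_whitespace)     # ['Automate', 'the', 'Boring', 'Stuff']
print(by_single_space)   # ['Automate', '', '', 'the', 'Boring', '', 'Stuff']
```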


## Joining Strings

```','.join(list)``` is a string method which is approximately the opposite of split -- it takes a list of strings and joins them into one big string, using the string it is called on (here, a comma) as the separator between the items.

In the example below, a comma followed by a space (```, ```) is used as the separator.

In [135]:
print (', '.join(["Rambutan", "Durian", "Jambu", "Langsat"]) )

print ()

print ( type ( ', '.join(["Rambutan", "Durian", "Jambu", "Langsat"]) ) )

Rambutan, Durian, Jambu, Langsat

<class 'str'>
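
Two common patterns build on ```join()```. The first combines it with ```split()``` to tidy up uneven spacing; the second converts numbers to strings first, because ```join()``` only accepts strings. The sketch below uses made-up values for illustration.

```python
messy = "  Wee   Kim Wee    School  "

# split() drops all the extra whitespace; join() puts single spaces back
tidy = " ".join(messy.split())
print(tidy)   # Wee Kim Wee School

# Hypothetical list of numbers: each one is converted with str() before joining
scores = [7, 9, 10]
line = ", ".join(str(n) for n in scores)
print(line)   # 7, 9, 10
```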


## Raw Strings

A raw string is written with an ```r``` prefix before the opening quote.  It does not process escape sequences, so every backslash in the literal is kept as an ordinary character.

In other words, the backslash is treated as part of the string and not as the start of an escape sequence.

Raw strings are helpful if you are typing string values that contain many backslashes, such as the location of a file or directory on a hard drive.

Raw strings are frequently used in two situations:

* when interacting with the operating system
* when using regular expressions

We will be revisiting them when we discuss the os, glob, shutil and pathlib modules.

In [136]:
print ( "Printed without the use of a raw string : c:\\Users\\ascklee\\Documents" ) #regular string
print ()
print ( r"Printed using a raw string              : c:\Users\ascklee\Documents" ) # raw string
# in a raw string, each backslash is kept as it is, so there is no need to double it

Printed without the use of a raw string : c:\Users\ascklee\Documents

Printed using a raw string              : c:\Users\ascklee\Documents
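
The difference can also be seen by counting characters. In the sketch below (the variable names are assumptions for illustration), the same characters are typed twice, once as a normal string and once as a raw string.

```python
normal = "a\tb"   # \t is read as one tab character
raw = r"a\tb"     # the backslash and the t are kept as two characters

print(len(normal))       # 3
print(len(raw))          # 4
print(raw == "a\\tb")    # True
```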


## Counting Occurrences of a Substring

The ```count()``` string method counts the number of non-overlapping occurrences of a substring in a string.

In [137]:
text = "I simply love apples, and apples are my favorite fruit.  I don't quite like pineapples though."

print ( text.count("apple") )              # apples, apples and pineapples

3


In [138]:
# A famous speech by US President Abraham Lincoln

Gettysburg_Address = """Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.
We are met on a great battle-field of that war.
We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.
It is altogether fitting and proper that we should do this.
But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground.
The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract.
The world will little note, nor long remember what we say here, but it can never forget what they did here.
It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced.
It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government of the people, by the people, for the people, shall not perish from the earth."""

print ( Gettysburg_Address.count("the") )        # the number of times the substring "the" appears, including inside words such as "they" and "whether"
print ( Gettysburg_Address.count("we") )         # the number of times the substring "we" appears in Lincoln's Gettysburg Address
print ( Gettysburg_Address.count("here") )       # the number of times the substring "here" appears in Lincoln's Gettysburg Address

20
9
8
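
Because ```count()``` counts substrings, it also finds matches inside longer words. One simple way to count whole words instead is to split the string first and count items in the resulting list (lists have their own ```count()``` method). The sketch below uses a made-up phrase without punctuation.

```python
phrase = "the other day they went there"

as_substring = phrase.count("the")           # the, o-the-r, the-y, the-re
as_whole_word = phrase.split().count("the")  # only the stand-alone word "the"
non_overlapping = "banana".count("ana")      # overlapping matches are counted once

print(as_substring)      # 4
print(as_whole_word)     # 1
print(non_overlapping)   # 1
```

With real text, punctuation attached to words (such as "the,") would need to be stripped before counting whole words.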


## The splitlines() Method

The ```splitlines()``` method splits the string at line breaks and returns a list of lines in the string.

If ```keepends``` is provided and is True, the line breaks are also included in the items of the list.  By default, the line breaks are not included.

In [139]:
shopping_list = 'instant noodles\npeanut butter\nmarmalade\nham\nbacon\ncorned beef'
# each \n is written directly between the items, with no spaces around it

print (shopping_list)
print ()

print ( shopping_list.splitlines() )   # line breaks are not included -- each line becomes one item in the list
print ( shopping_list.splitlines(keepends=True) )   # line breaks are included -- each item keeps its trailing "\n"

groceries = 'instant noodles  peanut butter  marmalade  ham  bacon  corned beef'
print ( groceries.splitlines() )          # there are no line breaks, so the result is a list with a single item

instant noodles
peanut butter
marmalade
ham
bacon
corned beef

['instant noodles', 'peanut butter', 'marmalade', 'ham', 'bacon', 'corned beef']
['instant noodles\n', 'peanut butter\n', 'marmalade\n', 'ham\n', 'bacon\n', 'corned beef']
['instant noodles  peanut butter  marmalade  ham  bacon  corned beef']
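
It may seem that ```split("\n")``` would do the same job, but the two behave differently on text that ends with a line break or that uses Windows-style "\r\n" line endings. The sketch below compares them on a made-up string.

```python
text = "first\nsecond\r\nthird\n"

with_splitlines = text.splitlines()   # understands both \n and \r\n, no empty item at the end
with_split = text.split("\n")         # leaves the \r behind and adds an empty final item

print(with_splitlines)   # ['first', 'second', 'third']
print(with_split)        # ['first', 'second\r', 'third', '']
```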


In [140]:
Gettysburg_Address = """Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.
We are met on a great battle-field of that war.
We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.
It is altogether fitting and proper that we should do this.
But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground.
The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract.
The world will little note, nor long remember what we say here, but it can never forget what they did here.
It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced.
It is rather for us to be here dedicated to the great task remaining before us -- that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion -- that we here highly resolve that these dead shall not have died in vain -- that this nation, under God, shall have a new birth of freedom -- and that government of the people, by the people, for the people, shall not perish from the earth."""

text_in_separate_lines = Gettysburg_Address.splitlines(keepends=False)

print ( text_in_separate_lines )   # this produces a list of the separate lines

print ()

for line in text_in_separate_lines:
    print ( line )   # this for loop prints the individual lines (for loops will be discussed next week)

['Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal.', 'Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure.', 'We are met on a great battle-field of that war.', 'We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live.', 'It is altogether fitting and proper that we should do this.', 'But, in a larger sense, we can not dedicate -- we can not consecrate -- we can not hallow -- this ground.', 'The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract.', 'The world will little note, nor long remember what we say here, but it can never forget what they did here.', 'It is for us the living, rather, to be dedicated here to the unfinished work which the

## The ```print()``` Function

The ```print()``` function is one of the most frequently used functions in Python.

This function is a critical one because the results of your computations have to be displayed in some way to the user!

In this section, we will learn how to use this function effectively to structure your output.

There are several methods of printing.  We will start with the simplest method -- using comma separators.

In [148]:
duration = 10
profession = "librarian"
name = "Mr. Tan"

print (name, "has been a/an", profession, "for", duration, "years.")
# print() inserts a single space between the comma-separated items
print (str(name) + "has been a/an" + str(profession) + "for", str(duration) + "years.")
# the + operator joins strings only (hence str()) and adds no spaces, so the words run together

# note that each expression is separated by a comma

Mr. Tan has been a/an librarian for 10 years.
Mr. Tanhas been a/anlibrarianfor 10years.


In [142]:
language = "Malay"
greeting = "'selamat pagi'"

print ("In", language, "use", greeting, "to greet others in the morning.")

# note that each expression is separated by a comma

In Malay use 'selamat pagi' to greet others in the morning.


In [143]:
item = "book"
price = 12.34

print ("This", item, "costs", price, "Singapore dollars.")

# note that each expression is separated by a comma

This book costs 12.34 Singapore dollars.


In [144]:
first_number = 5
second_number = 6

print (first_number, "times", second_number, "is", first_number * second_number)

# print() prints a bunch of expressions which are separated by comma

5 times 6 is 30
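
The comma-separated form of `print()` is controlled by two keyword arguments: `sep` (what goes between the items, a single space by default) and `end` (what is printed last, a newline by default). The sketch below is only an illustration; the sample values and the `io.StringIO` buffer used to capture the output are assumptions made for this example, not part of the cells above.

```python
import io

name = "Mr. Tan"
profession = "librarian"
duration = 10

# Default behaviour: items are joined by sep=" " and followed by end="\n".
print(name, "has been a", profession, "for", duration, "years.")

# A custom separator replaces the single space between the items.
print(2024, 1, 31, sep="-")

# A custom end replaces the trailing newline, so the next print() continues on the same line.
print("Loading", end="... ")
print("done")

# Illustrative only: print() can write into a buffer, which makes the exact output easy to inspect.
buffer = io.StringIO()
print(2024, 1, 31, sep="-", file=buffer)
captured = buffer.getvalue()
print(repr(captured))
```

Because `sep` and `end` are ordinary strings, they can be empty (`sep=""`) or longer than one character (`sep=" -- "`).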


### Printing Using the String Modulo Operator (```%```)

An older method of building output is to use ```%``` (the *string modulo operator*).

Format:

```print (string % variables)```

In [145]:
print (27 % 5)   # in this context, the "%" operator is the modulus (remainder) operator

2


In [162]:
quantity = 10
item = "eggs"

print ("I ordered %d %s." % (quantity, item))      # d - decimal integer; s - string
print ("I ordered %2d %s." % (quantity, item))     # 2 = minimum field width
print ("I ordered %5d %s." % (quantity, item))     # 5 = minimum field width (padded with spaces on the left)
print ("I ordered %10d %s." % (quantity, item))    # 10 = minimum field width
print ("I ordered %.5d %s." % (quantity, item))    # .5 = precision: at least 5 digits, padded with zeros

# Note the format of the print() function:
# print ( string % tuple )
# the string will contain placeholders
# the tuple will contain the values that are supposed to replace the placeholders

I ordered 10 eggs.
I ordered 10 eggs.
I ordered    10 eggs.
I ordered         10 eggs.
I ordered 00010 eggs.


In [171]:
quantity = 7531
price = 12.345678

print ("Quantity: %.0u, Price per unit: $%15.6f, Amount: $%.6f" % (quantity, price, quantity * price))   # using u
print ("Quantity: %5i, Price per unit: $%8.2f, Amount: $%15.2f" % (quantity, price, quantity * price))   # using i
print ("Quantity: %5d, Price per unit: $%8.2f, Amount: $%15.2f" % (quantity, price, quantity * price))   # using d

Quantity: 7531, Price per unit: $      12.345678, Amount: $92975.301018
Quantity:  7531, Price per unit: $   12.35, Amount: $       92975.30
Quantity:  7531, Price per unit: $   12.35, Amount: $       92975.30



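|Conversion|Meaning|
|---|---|
|d|Signed integer decimal|
|i|Signed integer decimal|
|b|Not a valid conversion for the ```%``` operator on strings; binary output needs ```format()``` or an f-string|
|o|Signed octal|
|u|Obsolete; identical to d|
|x|Signed hexadecimal (lowercase)|
|X|Signed hexadecimal (uppercase)|
|e|Floating-point, exponential format (lowercase)|
|E|Floating-point, exponential format (uppercase)|
|f|Floating-point, decimal format|
|F|Floating-point, decimal format|
|g|Floating-point; exponential format (lowercase) if the exponent is less than -4 or not less than the precision, decimal format otherwise|
|G|Same as g, but with uppercase exponential format|
|s|String (converted using str())|
|%|A literal '%' character|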

In [174]:
# print integer value (d)
print("Number of students : %3d, Boys : %2d, Girls : %1d\n" % (240, 110, 130))
# the number before d is a minimum field width: a value with more digits is printed in full, so 130 is never cut down to 1

# print integer and float values -- mix and match (d and f)
print("Quantity : %2d, Length : %5.2f, Width : %5.5f\n" % (1, 5.3_329, 1.4_278))

# print octal value (o)         # 7 is the minimum field width; .3 is the minimum number of digits (padded with zeros if needed)
print("%7d is %7.3o in the octal number system\n" % (250, 250))
print("%d is %o in the octal number system\n" % (250, 250))

# print hexadecimal value (x)
print("%5d is %7.3x in the hexadecimal number system\n" % (1_000, 1_000))

# print exponential value (e)
print("speed of light = %10.3e metres per second\n" % (2_9979_2458))

# print string and exponential values (s and e)
print("distance from %s to %s  = %10.3E metres\n" % ("the Sun", "Earth", 149_597_870_700))

Number of students : 240, Boys : 110, Girls : 130

Quantity :  1, Length :  5.33, Width : 1.42780

    250 is     372 in the octal number system

250 is 372 in the octal number system

 1000 is     3e8 in the hexadecimal number system

speed of light =  2.998e+08 metres per second

distance from the Sun to Earth  =  1.496E+11 metres
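
Two entries in the conversion table are not exercised by the cell above: `g`, which switches between decimal and exponential notation, and `%%`, which prints a literal percent sign. The following is a minimal sketch with made-up sample values.

```python
# %g chooses decimal or exponential notation depending on the size of the number
# (the default precision is 6 significant digits).
small = "%g" % 0.00001234      # exponent is below -4, so exponential notation is used
medium = "%g" % 1234.5         # fits within 6 significant digits, so decimal notation is used
large = "%g" % 1234567.0       # needs more than 6 significant digits, so exponential notation is used

# %% produces a single literal percent sign.
discount = "%d%% off" % 35

print(small, medium, large, discount, sep=" | ")
```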



In [175]:
# print integer values

print("Number of students : %3d, Boys : %2d, Girls : %2d" % (240, 110, 130, 150, 12.5))   # TypeError -- too many numbers!

TypeError: not all arguments converted during string formatting

In [176]:
# print integer values

print("Number of students : %3d, Boys : %2d, Girls : %2d" % (240, 110, 130))   # corrected

Number of students : 240, Boys : 110, Girls : 130


### Printing Using the ```format()``` Method

A newer way is to use the ```format()``` method.

The string ```format()``` method is a handy way to paste values into a string.  It uses the special marker ```{}``` within a string to mark where things go.

Format:

```print (string.format(variables))```

In [195]:
import math

print ("The floating-point value of pi is {0:15.7f}, to {1:7d} decimal places.".format(math.pi, 7))
# the positional index may be omitted, but the colon before the format specification is still required
print ("The floating-point value of pi is {:.0f}, to {:7d} decimal places.".format(math.pi,7))

The floating-point value of pi is       3.1415927, to       7 decimal places.
The floating-point value of pi is 3, to       7 decimal places.


In [155]:
# using format() method
print ("A good {} textbook is {} by {}.".format('Python', 'Python 3: Pocket Primer', "James Parker"))
print ()

# each pair of curly braces is replaced by the corresponding argument of format()

# using format() method and using position
print ("A good {0} textbook is {1} by {2}.".format('Python', 'Python 3: Pocket Primer', "James Parker"))
print ()

# using format() method and using position
print ("A good {2} textbook is {0} by {1}.".format("Python 3: Pocket Primer", "James Parker", "Python"))
print ()

a = "Python"
b = "Python 3: Pocket Primer"
c = "James Parker"

# using format() method with variables as the arguments
print ("A good {} textbook is {} by {}.".format(a, b, c))

A good Python textbook is Python 3: Pocket Primer by James Parker.

A good Python textbook is Python 3: Pocket Primer by James Parker.

A good Python textbook is Python 3: Pocket Primer by James Parker.

A good Python textbook is Python 3: Pocket Primer by James Parker.


In [192]:
import math

print (math.pi)
print ()
print ("Pi to 5 decimal places is {:.6g}.".format(math.pi))   # g = general format; .6 = 6 significant digits
print ("Usually, we don't need pi to a greater accuracy than {:.3g}.".format(math.pi))
print ("Don't go overboard, a pi value of {:.2g} is extremely inaccurate!".format(math.pi))

3.141592653589793

Pi to 5 decimal places is 3.14159.
Usually, we don't need pi to a greater accuracy than 3.14.
Don't go overboard, a pi value of 3.1 is extremely inaccurate!


In [197]:
discount = 0.35

print ("{:.0%}".format(discount))   # 0.35 = 35%
print ("{:%}".format(discount))     # no precision given, so the default of 6 decimal places is used
print ("{:1.1%}".format(discount))   # 1 = field width; 1 = decimal places
print ("{:5.1%}".format(discount))   # 5 = field width; 1 = decimal places
print ("{:7.1%}".format(discount))   # 7 = field width; 1 = decimal places
print ("{:10.1%}".format(discount))   # 10 = field width; 1 = decimal places

35%
35.000000%
35.0%
35.0%
  35.0%
     35.0%


In [218]:
quantity = 7
price = 12.34

print ("Quantity: {:.4g}, Price per unit: ${:.4g}, Amount: ${:.4g}".format(quantity, price, quantity * price))
# .4g displays each value with up to 4 significant digits

print ("Quantity: {:b}, Price per unit: ${:.4g}, Amount: ${:.4g}".format(quantity, price, quantity * price))
# b displays the integer in binary; whatever follows the colon is the format specification

Quantity: 7, Price per unit: $12.34, Amount: $86.38
Quantity: 111, Price per unit: $12.34, Amount: $86.38
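
A format specification can also group digits in threes with a comma, which helps with amounts like the ones above. This is a small sketch that reuses the quantity and price from an earlier cell; the field width of 12 is an arbitrary choice.

```python
quantity = 7531
price = 12.345678
amount = quantity * price

# A comma in the format specification groups the digits in threes.
grouped = "{:,}".format(quantity)

# Width, grouping and precision can be combined:
# at least 12 columns, comma grouping, 2 decimal places.
total = "{:12,.2f}".format(amount)

print("Quantity:", grouped)
print("Amount: $" + total)
```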


#### Parametrized String Formatting

In [187]:
import math

print ("The special number is {}.".format(math.pi))

The special number is 3.141592653589793.


In [188]:
import math

print ("The special number is {:5.2f}.".format(math.pi))

The special number is  3.14.


In [189]:
import math
# a replacement field can refer to a keyword argument of format() by name
# here the field is called special_number, and format() gives it the value math.pi
print ("The special number is {special_number:5.2f}.".format(special_number = math.pi))

The special number is  3.14.


In [190]:
import math

number_of_decimal_places = 4

print ("The special number is {special_number:5.{accuracy}f}.".format(special_number = math.pi, accuracy = number_of_decimal_places))

The special number is 3.1416.
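
Both the field width and the number of decimal places can be supplied as parameters in the same way. The helper below, `format_number`, is a hypothetical name introduced only for this sketch.

```python
import math

def format_number(value, width, places):
    """Return value right-aligned in `width` columns with `places` decimal places."""
    # The inner fields {w} and {p} are filled in first, producing a spec such as "10.2f".
    return "{v:{w}.{p}f}".format(v=value, w=width, p=places)

for places in range(1, 5):
    print("[" + format_number(math.pi, 10, places) + "]")
```

The square brackets are there only to make the padding visible in the output.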


### Printing Using f-Strings

In Python 3.6, a new way of formatting strings was introduced.  This method is called f-strings (formatted string literals).

Study how the following print statements work.

In [198]:
import math

answer = math.e**math.pi

print (f"The answer is {answer}.")
print (f"The answer is {answer:.2f}.")
print (F"The answer is {answer:.2e}.")
print (F"The answer is {answer:.2g}.")

The answer is 23.140692632779263.
The answer is 23.14.
The answer is 2.31e+01.
The answer is 23.


In [199]:
import math

print (f"The answer is {math.e**math.pi}.")
print (f"The answer is {math.e**math.pi:.2f}.")
print (F"The answer is {math.e**math.pi:.2e}.")
print (F"The answer is {math.e**math.pi:.2g}.")

The answer is 23.140692632779263.
The answer is 23.14.
The answer is 2.31e+01.
The answer is 23.


In [200]:
name = "International Business Machines (IBM)"
company_type = "computer hardware"

print(f"{name} is a {company_type} company.")

International Business Machines (IBM) is a computer hardware company.


In [201]:
school = "University of Tokyo"
country = "Japan"

print (f"The {school} is a great university in {country}.")

The University of Tokyo is a great university in Japan.


In [202]:
print (f"4 times 7 equals {4*7}.")

4 times 7 equals 28.


In [203]:
location = "changi airport: terminal 4 (passenger waiting area)"

print(f"{location.title()}")

# any expression can appear inside the braces; here the title() method is called on the variable


Changi Airport: Terminal 4 (Passenger Waiting Area)


In [204]:
# This program shows that you can use either lowercase or uppercase "f" to indicate f-strings.

a = "Python"
b = "Python 3: Pocket Primer"
c = "James Parker"

d = "Java"
e = "Core Java"
f = "Cay S. Horstmann"

print (f"A good {a} textbook is {b} by {c}.")       # using lowercase "f"

print (F"A good {d} textbook is {e} by {f}.")       # using uppercase "f"

A good Python textbook is Python 3: Pocket Primer by James Parker.
A good Java textbook is Core Java by Cay S. Horstmann.


In [205]:
quantity = 7
price = 12.34
# total = quantity * price

print (f"Quantity: {quantity}, Price per unit: ${price}, Amount: ${quantity * price}")     # small f

print ()

print (F"Quantity: {quantity}, Price per unit: ${price}, Amount: ${quantity * price}")     # capital F

Quantity: 7, Price per unit: $12.34, Amount: $86.38

Quantity: 7, Price per unit: $12.34, Amount: $86.38


Adjacent string literals are joined automatically, which makes it possible to spread an f-string over several lines.

In [206]:
print ("English" "French" "German" "Russian")

EnglishFrenchGermanRussian


In [207]:
name = 'John Doe'
age = 32
occupation = 'gardener'

personal_details = (
    f'Name: {name}\n'
    f'Age: {age}\n'
    f'Occupation: {occupation}'
)


print ( personal_details )

Name: John Doe
Age: 32
Occupation: gardener


#### Formatting f-strings

In [208]:
value = 12.3

print(f'{value}')
print(f'{value:.2f}')   # f = floating-point value; 2 = 2 decimal places
print(f'{value:.5f}')   # f = floating-point value; 5 = 5 decimal places

12.3
12.30
12.30000


In [214]:
for x in range(-3, 11):
    print(f'{x:+05} {x*x:03} {x*x*x:+09}')
    
# the text after the colon is the format specification: + always shows the sign, 0 pads with zeros, and the number is the minimum field width

-0003 009 -00000027
-0002 004 -00000008
-0001 001 -00000001
+0000 000 +00000000
+0001 001 +00000001
+0002 004 +00000008
+0003 009 +00000027
+0004 016 +00000064
+0005 025 +00000125
+0006 036 +00000216
+0007 049 +00000343
+0008 064 +00000512
+0009 081 +00000729
+0010 100 +00001000


In [213]:
for x in range(-3, 11):
    print(f'{x:+05} {x*x:03} {x*x*x:2}')
# without a + in the format specification, only negative numbers are shown with a sign
# x runs from -3 to 10; each value is printed in a field 5 characters wide, with zeros filling the empty positions

# {x:+05} - print x, include the sign, pad with zeros
# {x*x:03} - print x*x, do not include the sign unless it is negative, pad with zeros
# {x*x*x:2} - print x*x*x, do not include the sign unless it is negative, do not pad with zeros

-0003 009 -27
-0002 004 -8
-0001 001 -1
+0000 000  0
+0001 001  1
+0002 004  8
+0003 009 27
+0004 016 64
+0005 025 125
+0006 036 216
+0007 049 343
+0008 064 512
+0009 081 729
+0010 100 1000


In [210]:
s1 = 'a'
s2 = 'ab'
s3 = 'abc'
s4 = 'abcd'

print (f'{s1:^10}')   # centre the string
print (f'{s2:<10}')   # left justify the string
print (f'{s3:>10}')   # right justify the string
print (f'{s4:^10}')   # centre the string

    a     
ab        
       abc
   abcd   


In [211]:
a = 300

# hexadecimal
print (f"{a:x}")   # display the integer in hexadecimal format (lowercase digits a to f)
print (f"{a:X}")   # display the integer in hexadecimal format (uppercase digits A to F)
print ()

# octal
print (f"{a:o}")   # display the integer in octal format
print ()

# scientific notation
print (f"{a:e}")   # display the number in scientific format (lowercase: e)
print (f"{a:E}")   # display the number in scientific format (uppercase: E)

12c
12C

454

3.000000e+02
3.000000E+02


In [228]:
string = "Singapore Polytechnic"

print (f"The school's name begins with {string:.9} ...")

# .9 keeps only the first nine characters of the string

print (f"The school's name begins with {string:.4} ...")

# unlike a field width for numbers (%2d still prints 1234 in full), a precision on a string cuts it off at exactly that position

The school's name begins with Singapore ...
The school's name begins with Sing ...


### Conditional String Formatting

The format specification inside an f-string can itself be an expression, so the way a value is displayed can depend on a condition.

In [233]:
number = 123.456789

print ( f'The number is: {number:{".2f" if number > 100 else ""}}' )
print ( f'The number is: {number:{".2f" if number > 1000 else ""}}' )

# .2f is applied only when the number exceeds the threshold; otherwise the format specification is empty

number = 23.456789

print ( f'The number is: {number:{".2f" if number > 100 else ""}}' )
print ( f'The number is: {number:{".1f" if number > 10 else ""}}' )

The number is: 123.46
The number is: 123.456789
The number is: 23.456789
The number is: 23.5


In [221]:
name = "alice"

print (f"Name: {name.upper() if 1 < len(name) < 6 else name.capitalize()}")

name = "benedict"

print (f"Name: {name.upper() if 1 < len(name) < 6 else name.capitalize()}")

Name: ALICE
Name: Benedict
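
When the same rule is needed in several places, the condition can be moved into a small function. In this sketch, `describe` is a hypothetical helper and the threshold of 100 is an arbitrary choice taken from the cell above.

```python
def describe(number):
    """Show values above 100 with 2 decimal places and smaller values in full."""
    # Choose the format specification first, then use it as a nested field.
    spec = ".2f" if number > 100 else ""
    return f"The number is: {number:{spec}}"

print(describe(123.456789))
print(describe(23.456789))
```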


### Printing Using Self-documenting f-Strings

In Python 3.8, the f-strings method of printing was enhanced with the ```=``` specifier.  An f-string that uses it is called a self-documenting f-string: it prints the text of the expression followed by its value.

In [222]:
# This is how it's done without self-documenting f-strings.

import math

x = 0.8

print (f'math.sin(x) = {math.sin(x)}')
print (f'math.cos(x) = {math.cos(x)}')

math.sin(x) = 0.7173560908995228
math.cos(x) = 0.6967067093471654


In [223]:
# With self-documenting f-strings (introduced in Python 3.8), we have a more compact way of coding this ...

import math

x = 0.8

print (f'{math.cos(x) = }')                                      # Python 3.8+
print (f'{math.sin(x) = }')                                      # Python 3.8+

math.cos(x) = 0.6967067093471654
math.sin(x) = 0.7173560908995228
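
The `=` specifier can be combined with a format specification, and the spaces around `=` are reproduced exactly as written. A short sketch follows (Python 3.8 or later is assumed, and the variable names are illustrative).

```python
import math

x = 0.8
name = "alice"

compact = f"{x=}"                     # no spaces around =, so none appear in the output
rounded = f"{math.sin(x) = :.3f}"     # a format specification may follow the =
quoted = f"{name = }"                 # without a format specification, repr() of the value is shown

print(compact)
print(rounded)
print(quoted)
```

Strings appear with quotation marks in this form because `repr()` is used when no format specification is given.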


## Handling Strings with Characters from Other Languages

When handling characters from languages other than English, the chr() and ord() functions are indispensable!

The chr() function takes a non-negative integer (a Unicode code point) and returns the corresponding character.

* The letters A to Z have a decimal representation of 65 through 90.
* The letters a to z have a decimal representation of 97 through 122.

The ord() function does the opposite.  It takes a single character and returns its code point as an integer.

In [224]:
print ( chr(65) )   # the chr() function converts an integer into a character

print ( ord('A') )   # the ord() function converts a character into an integer

# Both functions do opposite things!

A
65
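
Since `chr()` and `ord()` are inverses of each other, a string can be turned into a list of code points and back again. The following sketch uses illustrative variable names.

```python
# Build the uppercase alphabet from its code points (65 through 90).
uppercase = "".join(chr(code) for code in range(ord("A"), ord("Z") + 1))

# Convert a string into code points, then back into a string.
codes = [ord(character) for character in "Python"]
restored = "".join(chr(code) for code in codes)

print(uppercase)
print(codes)
print(restored)
```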


In [225]:
print ("The UNICODE representation of 李 is:", ord("李"))        # returns an integer representing "李"
print ("The UNICODE representation of 光 is:", ord("光"))        # returns an integer representing "光"
print ("The UNICODE representation of 耀 is:", ord("耀"))        # returns an integer representing "耀"

# play around with unicode, using ord() and chr()

print ()
print ( chr(26446), chr(20809), chr(32768) )           # returns a character whose Unicode code point is the integer

print ()
print ( chr(26446), chr(20809), chr(32768), sep="" )   # an empty string as sep = no separation between the characters

# Unicode code points are usually written in hexadecimal, so hex(ord(character)) gives the digits used in a \u escape

print ()
print ( hex(26446) )       # returns the hexadecimal representation of 26446
print ( hex(20809) )       # returns the hexadecimal representation of 20809
print ( hex(32768) )       # returns the hexadecimal representation of 32768

print ()
print ("\u674e\u5149\u8000")                       # \u plus four hexadecimal digits stands for the character with that code point (e.g. 674e)
print ("\u0420\u043e\u0441\u0441\u0438\u044f")
print ("\U0001F923")                               # \U takes eight hexadecimal digits
print ("\N{thinking face}")                         # \N{...} looks a character up by its Unicode name

# The chart at https://unicode.org/emoji/charts/full-emoji-list.html lists emoji with their CLDR short names, which often (but not always) match the Unicode names

The UNICODE representation of 李 is: 26446
The UNICODE representation of 光 is: 20809
The UNICODE representation of 耀 is: 32768

李 光 耀

李光耀

0x674e
0x5149
0x8000

李光耀
Россия
🤣
🤔


## The ```center()```, ```ljust()``` and ```rjust()``` Methods

These methods are used to justify text.

In [226]:
text = "Experimenting with strings."

print (len(text))
print ()

# Printing the center aligned string
print ("Center aligned string:")
print ( text.center(50) )
print ()

# Printing the left aligned string
print ("Left aligned string:")
print ( text.ljust(50) )
print ()

# Printing the right aligned string
print ("The right aligned string:")
print ( text.rjust(50) )                        # rjust() adds padding, whereas lstrip() and rstrip() remove whitespace

27

Center aligned string:
           Experimenting with strings.            

Left aligned string:
Experimenting with strings.                       

The right aligned string:
                       Experimenting with strings.
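
A common use of these methods is lining up columns of text. The sketch below uses a made-up shopping list with invented prices; the column widths of 20 and 8 are arbitrary.

```python
# Hypothetical data for illustration only.
items = [("instant noodles", 4.50), ("peanut butter", 6.20), ("ham", 12.00)]

lines = []
for item_name, item_price in items:
    # Names are left-justified in 20 columns, prices right-justified in 8 columns.
    lines.append(item_name.ljust(20) + f"{item_price:.2f}".rjust(8))

print("\n".join(lines))
```

Every line ends up exactly 28 characters long, so the prices line up on the right.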


### Truncating Strings

Strings can also be truncated.  In the cell below, only the first nine characters of the string stored in the variable *string* are printed.

In [227]:
string = "Singapore Polytechnic"

print (f"The school's name begins with {string:.9} ...")   # just the first 9 characters of the string

The school's name begins with Singapore ...


### Using Padding

To see the effect of ```center()```, ```ljust()``` and ```rjust()``` more clearly, let's pad the string with visible characters instead of spaces.

In [234]:
text = "Experimenting with strings."

# Printing the center aligned string padded with "%"
print ("Center aligned string, padded with '%':")
print ( text.center(50, '%') )
print()

# Printing the left aligned string padded with "-" 
print ("Left aligned string, padded with '-':")
print ( text.ljust(50, '-') )
print()

# Printing the right aligned string padded with "*"
print ("The right aligned string, padded with '*':")
print ( text.rjust(50, '*') )

Center aligned string, padded with '%':
%%%%%%%%%%%Experimenting with strings.%%%%%%%%%%%%

Left aligned string, padded with '-':
Experimenting with strings.-----------------------

The right aligned string, padded with '*':
***********************Experimenting with strings.


In [235]:
# Let's try something fancy (that results in an error -- a TypeError!).

text = "Experimenting with strings."

# Printing the center aligned string padded with "%!%!%!%!%!"
print ("Center aligned string, padded with : ")
print ( text.center(50, '%!') )   # TypeError - the padding needs to be *exactly* one character long
print()

Center aligned string, padded with : 


TypeError: The fill character must be exactly one character long

## The ```zfill()``` Method

The ```zfill()``` method adds zeros (0) at the beginning of the string, until it reaches the specified length.

If the requested width is *less than or equal to* the length of the string, *no filling is done*.

In [236]:
text = "Singapore is a great city to live in!"

print ( len (text) )           # text has 37 characters
print ()
print ( text.zfill(60) )       # 60 > 37
print ( text.zfill(50) )       # 50 > 37
print ( text.zfill(40) )       # 40 > 37
print ( text.zfill(30) )       # 30 < 37
print ( text.zfill(20) )       # 20 < 37
print ( text.zfill(10) )       # 10 < 37
print ( text.zfill(5) )        #  5 < 37

37

00000000000000000000000Singapore is a great city to live in!
0000000000000Singapore is a great city to live in!
000Singapore is a great city to live in!
Singapore is a great city to live in!
Singapore is a great city to live in!
Singapore is a great city to live in!
Singapore is a great city to live in!
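
In practice `zfill()` is mostly applied to numbers stored as strings, where it also takes care of a leading sign. A brief sketch with illustrative values:

```python
invoice_number = str(42).zfill(6)   # an integer must be converted to a string first
negative = "-42".zfill(6)           # the zeros are inserted after the minus sign
positive = "+7".zfill(4)            # a plus sign is handled the same way
unchanged = "1234567".zfill(6)      # already longer than the width, so nothing is added

print(invoice_number, negative, positive, unchanged)
```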


## The ```translate()``` Method

In [237]:
original_string = "hello, wkwsci!"

translate_table = original_string.maketrans("abcdefghijklm", "nopqrstuvwxyz")
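# maketrans() builds the translation table: each character of the first string is mapped to the character at the same position in the second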

original_string.translate(translate_table)

'uryyo, wxwspv!'

The use of the ```translate()``` method to delete characters is demonstrated below.

In [238]:
original_string = "hello, wkwsci!"

translate_table = original_string.maketrans({'l':None,'w':None})

original_string.translate(translate_table)

'heo, ksci!'
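
Replacing and deleting can be combined in one table: `str.maketrans()` accepts an optional third string whose characters are removed. The mapping below is an arbitrary choice made for this sketch.

```python
original_string = "hello, wkwsci!"

# First two arguments: 'l' becomes '0' and 'o' becomes '1'.
# Third argument: every '!' and ',' is deleted.
table = str.maketrans("lo", "01", "!,")

result = original_string.translate(table)
print(result)
```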

## The ```input()``` Function

The ```input()``` function reads a line from input, converts it into a *string* and returns it.

An optional prompt can be passed as an argument.

Question: Why are we discussing the ```input()``` function in a session that discusses strings?

In [None]:
what_the_user_keyed_in = input()                             # no prompt -- very user unfriendly!

print ("You keyed in:", what_the_user_keyed_in)

print (type(what_the_user_keyed_in))

In [None]:
country = input("Key in the name of a country: ").strip()     # the prompt tells the user what to key in; strip() removes stray spaces

print ("You keyed in:", country)

In [None]:
name = input("What is your name? ")

print ("Good morning, ", name, ".", sep = "")

In [None]:
name = input("What is your name? ")

print ("Good morning, " + name + ".")

In [None]:
number = input("Key in a number: ")   # the prompt tells the user to key in a number

# run this program, and key in 123 to see what happens
# run the program again, and now key in 123.123 to see what happens -- explain why

print ("number is of type:", type(number))   # the input() function converts it to a string

print ()

print ("You keyed in this number:", number)

number = int(number)

print ("Squaring that number gives:", number*number)   # type casting is needed before calculations

In [None]:
# Explaining what happened above:

print (int (123.456))   # no problems, the value is 123
print (int ("123.456"))   # this results in a ValueError

# Remember -- the input() function reads in the user input as an str (string) type

In [None]:
# This is an alternative to the above -- the type casting is done twice

number = input("Key in a number: ")   # the prompt tells the user to key in a number

print ("number is of type:", type(number))   # the input() function converts it to a string

print ()

print ("You keyed in this number:", number)

print ("Squaring that number gives:", int(number)*int(number))   # type casting done twice here

In [None]:
x, y = "3       4".split()

print ( type(x) )                                            # x is a string (str)
print ( type(y) )                                            # y is a string (str)
print ("The sum is", float(x)+float(y))                      # x and y need to be typecast before performing calculations
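
Because `input()` always returns a string, a program usually has to decide how to convert it and what to do when the conversion fails. The function below, `to_number`, is a hypothetical helper written for this sketch; the sample strings stand in for what a user might key in.

```python
def to_number(text):
    """Convert text to an int if possible, otherwise to a float; return None if neither works."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        pass                      # not an integer, so try a float next
    try:
        return float(text)
    except ValueError:
        return None               # not a number at all

# In a real program, each sample would come from input().
for sample in ["123", " 123.123 ", "abc"]:
    print(repr(sample), "->", to_number(sample))
```

Trying `int()` before `float()` keeps whole numbers as integers, which is one possible design choice among several.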

## Multiple Assignment

You can assign multiple values to multiple variables by separating variables and values with commas.

In [None]:
a = 100   # a is assigned the value of 100
b = 200   # b is assigned the value of 200

print (a, b, sep = " -- ")

In [None]:
a, b = 100, 200   # same as above -- this is called multiple assignment

print (a, b, sep = " -- ")

You can assign more than two variables at one go.

In [None]:
a, b, c = 100, 200, 300   # assigning values to three variables with a single statement

print (a, b, c, sep = " !! ")

In [None]:
a, b, c, d, e = "Singapore", "Malaysia", "Thailand", "Indonesia", "Philippines"

print (a, b, c, e, d, sep = " ** ")

In [None]:
a, b, c, d = 100, 200, 300   # ValueError -- number of variables and values must match!

print (a, b, c, d, sep = " ** ")

Multiple assignment works with different types of data.

In [None]:
a, b, c = 1, 3+4j, "First Steps Towards Programming"

print (a, type(a), sep = ', ')
print (b, type(b), sep = ', ')
print (c, type(c), sep = ', ')

In [None]:
a, b, c, d = 1, [5.5, 6.6, 7.7], {'topic' : 'mathematics', 'difficulty' : 'easy'}, ("January", "February", "March")

print (a, ":", type(a))
print (b, ":", type(b))
print (c, ":", type(c))
print (d, ":", type(d))
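
Two further uses of multiple assignment are worth knowing: swapping two variables without a temporary one, and collecting the "remaining" values with a starred name. The values in this sketch are illustrative.

```python
a, b = 100, 200

# The right-hand side is evaluated first, so this swaps the two values.
a, b = b, a
print(a, b, sep=" -- ")

# A starred name collects whatever is left over into a list.
first, *rest = "Singapore", "Malaysia", "Thailand"
print(first)
print(rest)
```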

## Using the ```help()``` Function

We have already looked at two methods of getting help in Python:

* using the ```dir()``` built-in function  
* using the question mark ("```?```")

You can also use the Python ```help()``` function to get the documentation of a specified module, class, function, variable, etc.  The syntax is:

```help([object])```

In [None]:
help ( print )              # obtain help on the print() function

In [None]:
help ( "print" )            # same as above!

In [None]:
import math

help ( math.sqrt )   # help on the math.sqrt() function

In [None]:
import math

help ( "math.sqrt" )   # same as above

In [None]:
help ( statistics.mean )   # you get a NameError -- why?

In [None]:
help ( 3 )   # help on an integer (int) object

In [None]:
help ( 4.5 )   # help on a floating-point (float) object

In [None]:
help ( 3+4j )   # help on a complex object

In [None]:
help ( "Singapore" )

# from the previous three cells, you might expect help on a string object
# instead, help() treats the string as a name to look up and reports that no documentation was found
# the message suggests using help(str) if you need help on strings

In [None]:
help ( str )

## Summary: Three Ways to Get Help

1. use ```?```
2. use the ```dir()``` function
3. use the ```help()``` function