
# Exercise 2.1: String Methods

## Questions

## Question 1

1. How would you do a case-insensitive comparison, given the methods of `str`, on the strings "NICE dragons FINISH Last" and 'Nice Dragons Finish Last'? This is tricky. If you do not see the answer in 2 minutes, move on:

This is surprisingly difficult to do really well!

In [None]:
s1 = "NICE dragons FINISH Last"
s2 = 'Nice Dragons Finish Last'

The simple solution, often given, is:

In [None]:
s1.lower() == s2.lower()

However, that will not work in all languages. The German "ß", for example, uppercases to "SS", which lowercases to "ss" rather than back to "ß". Consider:

In [None]:
print("ß".lower())
print("ß".upper().lower())
print("ß".upper().lower() == "ß".lower())

`lower()` actually does pretty well with Greek, because it recognises the difference between medial and final sigma:

In [None]:
print("Σίσυφος".upper())
print("Σίσυφος".upper().lower())
print("ΣΊΣΥΦΟΣ".lower())

But it is not good enough as a general solution.

For this reason, there is the `str.casefold` method:

In [None]:
"ß".lower().casefold() == "ß".upper().casefold()

Or, for our question:

In [None]:
s1.casefold() == s2.casefold()

In some cases, even `casefold()` may not be enough. If your text contains Unicode characters with accents, those accents may be encoded in two ways: either as dedicated precomposed letters (the most common accented letters in Western European languages all have their own code point), or as combining characters (basically the unaccented character followed by a separate accent that is rendered on top of it). The two forms look identical but do not compare equal.

In these cases, the `unicodedata` module contains the function `normalize` that takes care of these equivalents, but that is beyond the scope of this course.
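
For the curious, here is a minimal sketch of how normalization and case folding could be combined. The helper names `caseless_key` and `caseless_equal` are made up for this illustration; the normalize, casefold, normalize pattern follows the one shown in the Python Unicode HOWTO, and it may not cover every edge case.

```python
import unicodedata


def caseless_key(text: str) -> str:
    """Illustrative helper: decompose, case-fold, then decompose again."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())


def caseless_equal(a: str, b: str) -> bool:
    """Illustrative helper: compare two strings ignoring case and composition."""
    return caseless_key(a) == caseless_key(b)


composed = "CAF\u00c9"      # E with acute accent as a single precomposed character
decomposed = "cafe\u0301"   # plain "e" followed by a combining acute accent

print(composed.casefold() == decomposed.casefold())  # casefold() alone
print(caseless_equal(composed, decomposed))          # normalize + casefold
```

With `casefold()` alone the two spellings compare as different, because one contains a precomposed letter and the other a base letter plus a combining accent.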

## Question 2

2. Given the string "Nice Dragons Finish Last", return the string split into a list of strings.

In [None]:
"Nice Dragons Finish Last".split()

## Question 3

3. To check to see if a string can be converted to a base 10 int, which method should be used? Give an example:

Again, this is surprisingly difficult to do really well (but not as difficult as the caseless comparison).

If the number is a positive integer, then the following will work:

In [None]:
print("42".isnumeric())
print("42".isdigit())
print("42".isdecimal())

Except that `isnumeric()` probably isn't what we are looking for:

In [None]:
print("\u2155")
print("\u2155".isnumeric())
print("\u2155".isdigit())
print("\u2155".isdecimal())

And probably neither is `isdigit()`:

In [None]:
print("2\u00b2")
print("2\u00b2".isdigit())
print("2\u00b2".isdecimal())
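
The three checks nest: anything accepted by `isdecimal()` is also accepted by `isdigit()`, and anything accepted by `isdigit()` is also accepted by `isnumeric()`. As a rough sketch, the examples above can be tabulated side by side (the `samples` list and the `check` helper are illustrative names, not part of the exercise):

```python
# Illustrative only: compare the three str predicates on the strings used above.
samples = ["42", "\u2155", "2\u00b2"]  # "42", the fraction one fifth, "2" + superscript two


def check(text: str) -> tuple:
    """Return (isdecimal, isdigit, isnumeric) for text."""
    return (text.isdecimal(), text.isdigit(), text.isnumeric())


for text in samples:
    print(repr(text), check(text))
```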

Fortunately, despite its name, `isdecimal()` doesn't check for decimal (that is, floating point) numbers; it checks for digits in the decimal base (base 10).

In [None]:
"42.0".isdecimal()

Note that `"42.0"` is not an integer, even though its value could be converted to one. By convention, adding `.0` to a number makes it a floating point number.

Unfortunately, `isdecimal()` doesn't work for negative integers (in fact, none of these methods do):

In [None]:
"-42".isdecimal()

You could try using a slice (we'll learn about them later):

In [None]:
s3 = "-42"
s3.isdecimal() or (s3.startswith('-') and s3[1:].isdecimal())

Basically, the `[1:]` says _all characters from 1 onwards_ (zero-based counting, so the `-` is character 0 in the string).
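
To make the slice concrete, here is a small sketch that pulls the string apart (the names `sign` and `digits` are just for illustration):

```python
s3 = "-42"

sign = s3[0]     # the single character at index 0
digits = s3[1:]  # everything from index 1 to the end

print(sign)
print(digits)

# Slicing past the end of a string does not raise an error; it returns "".
lone_minus = "-"
print(repr(lone_minus[1:]))
print(lone_minus[1:].isdecimal())
```

Because `"".isdecimal()` is `False`, a lone `"-"` is correctly rejected by the check above.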

This will still trip over `+42`. You could write increasingly complicated checks to take account of such cases, but most of them will still miss something. In the end, you may conclude that the best way is to try the conversion and catch the error. We'll learn how to do that later, but here it is for completeness (all wrapped up in a function; we'll learn more about those soon, too):

In [None]:
def isinteger(value: str) -> bool:
    try:
        int(value)
        return True
    except ValueError:
        return False

In [None]:
print("is 42 an integer?", isinteger("42"))
print("is -42 an integer?", isinteger("-42"))
print("is 42.0 an integer?", isinteger("42.0"))
print("is +42 an integer?", isinteger("+42"))
print("is 2\u00b2 an integer?", isinteger("2\u00b2"))
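
One caveat: `isinteger()` accepts whatever `int()` accepts, and that includes surrounding whitespace (`" 42 "`) and underscores between digits (`"4_2"`). If you wanted something stricter, one possible sketch uses a regular expression. The name `is_plain_integer` and the rule it applies (an optional sign followed by ASCII digits only) are assumptions made for this illustration, not part of the exercise:

```python
import re

# Hypothetical stricter rule: optional sign, then one or more ASCII digits.
_PLAIN_INT = re.compile(r"[+-]?[0-9]+")


def is_plain_integer(value: str) -> bool:
    """Return True only if the whole string matches the strict pattern."""
    return _PLAIN_INT.fullmatch(value) is not None


for candidate in ["42", "-42", "+42", " 42 ", "4_2", "42.0"]:
    print(repr(candidate), is_plain_integer(candidate))
```

Which of the two behaviours is "right" depends on where the string comes from; for user input, the more forgiving `int()` route is often what you want.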

# End of Notebook