### String Comparisons

We now look at how to compare strings and introduce a few comparison operators.

#### Equality comparison

Let's first examine how we can check if two strings are identical. For this comparison, we need the equality operator `==`.

Let's see how equality comparisons work in Python:

In [None]:
str1 = "hello" # assignment to var str1

In [None]:
print(str1 == "I'm good")

In [None]:
print(str1 == "Hello")

In [None]:
print(str1 == "hello")

Notice that **capitalization matters** when comparing strings in Python. To make the comparison case-insensitive, we typically convert both sides of the equality to the same case first:

In [None]:
print(str1.lower() == "Hello".lower() )
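
The same idea can be wrapped in a small helper. This is only a sketch: the function name `same_ignoring_case` is our own invention for illustration and is not part of Python.

```python
def same_ignoring_case(a, b):
    # Hypothetical helper: lowercase both strings, then compare them
    return a.lower() == b.lower()

str1 = "hello"
result_hello = same_ignoring_case(str1, "Hello")
result_other = same_ignoring_case(str1, "I'm good")
print(result_hello)  # True
print(result_other)  # False
```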

The opposite of the equality operator is the inequality operator, `!=`. For example:

In [None]:
email1 = "Junpei@nyu.edu"
email2 = "Keiko@nyu.edu"
print("Are the emails different?", email1 != email2 )

#### Ordering Strings

Strings also support ordering comparisons such as `<` and `>`. When we compare strings, the "smaller" string is the one that comes first in the dictionary. Let's see an example:

In [None]:
name1 = 'Abraham'
name2 = 'Bill'

# Abraham is lexicographically before Bill
print(name1 < name2)

In [None]:
name1 = 'Donald'
name2 = 'Bill'

# Donald is lexicographically after Bill
print(name1 < name2)

Notice though the following, where the capitalization of `Bill` changes:

In [None]:
name1 = 'Donald'
name2 = 'bill'

# Donald is lexicographically before bill
print(name1 < name2)

The reason is that the order is not simply the order in which we would encounter words in the dictionary. Technically, strings are compared character by character, based on the position of each character in the ASCII (or Unicode) table, where all uppercase letters come before all lowercase letters. Here is the ASCII table:

<img src = "https://www.asciitable.com/asciifull.gif">
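
We can also look up these character codes directly with the built-in `ord` function. The short sketch below uses the letters from the `Donald` and `bill` example; the variable names are ours, chosen only for illustration.

```python
# ord() returns the numeric code of a single character
code_upper_b = ord('B')
code_upper_d = ord('D')
code_lower_b = ord('b')
print(code_upper_b)  # 66
print(code_upper_d)  # 68
print(code_lower_b)  # 98

# 'D' (68) comes before 'b' (98), so 'Donald' is "smaller" than 'bill'
comparison = 'Donald' < 'bill'
print(comparison)  # True
```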

For example, if we sort the strings below, take a look at the resulting order:

In [None]:
# Space, followed by numbers, followed by uppercase, followed by lowercase
sorted(['Bill', '  ZZ TOP!!! ', 'HAHA', 'lol', 'LOL!', 'ZZZ', 'zzz', '123', '1230', '345'])

In [None]:
# Example of string comparison
# See ASCII table at http://www.asciitable.com/ for character order (FYI)

name1 = 'Ada'
name2 = 'Bill'

# Ada is lexicographically before Bill
print(name1 < name2)

name1 = 'ada'
name2 = 'Bill'
# However 'ada' is lexicographically after Bill (which starts with an uppercase letter)
print(name1 < name2)
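
If we want an ordering that ignores capitalization, one option is to pass `key=str.lower` to `sorted`, so that the lowercase version of each string is used for the comparison. The list of names below is made up for illustration.

```python
names = ['Bill', 'ada', 'Donald', 'charlie']

# Default ordering: all uppercase initials come before lowercase ones
default_order = sorted(names)
print(default_order)  # ['Bill', 'Donald', 'ada', 'charlie']

# Case-insensitive ordering: compare the lowercase version of each name
ignore_case_order = sorted(names, key=str.lower)
print(ignore_case_order)  # ['ada', 'Bill', 'charlie', 'Donald']
```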

### Finding text within string variables

####  `in` operator


+ The `in` operator, `needle in haystack`: reports if the string `needle` appears in the string `haystack`


For example, the string "New York" appears within "New York University", so the following expression evaluates to `True`:

In [None]:
"New York" in "New York University"

But, unlike reality, "New York University" is not in "New York" :-)

In [None]:
"New York University" in "New York"

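Like `==`, the `in` operator is case-sensitive. A minimal sketch, assuming we want to ignore case by lowercasing the haystack first (the variable names are ours):

```python
university = "New York University"

# The case differs, so this is not considered a match
exact_match = "new york" in university
print(exact_match)  # False

# Lowercasing the haystack first makes the check case-insensitive
lowered_match = "new york" in university.lower()
print(lowered_match)  # True
```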

####  `find` function

* The `find` function, `haystack.find(needle)`: searches `haystack` for `needle` and returns the position of the first occurrence, indexed from 0; returns -1 if not found.

For example:

In [None]:
word = "Python is the word. And on and on and on and on..." 
position = word.find("on") # The 'on' appears at the end of 'Python'
print(position)

In [None]:
print("The first time that we see the string on is at position", word.find("on"))

In [None]:
# Find a needle that is not in the haystack.
needle = "whatever"
haystack = "He can do whatever it takes to win re-election, and the Republican Party will have his back."
haystack.find(needle)
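
Because a match at the very beginning of the string gives position 0, and a failed search gives -1, it is safer to compare the result of `find` against -1 than to rely on the number itself. A small sketch with made-up strings:

```python
haystack = "New York University"

# A match at the very start of the string is reported as position 0
found_position = haystack.find("New")
print(found_position)        # 0
print(found_position != -1)  # True: "New" was found

# No match is reported as -1
missing_position = haystack.find("Boston")
print(missing_position)        # -1
print(missing_position != -1)  # False: "Boston" was not found
```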

(advanced) If we want to find additional appearances of the string, we can pass a second parameter to the `find` function, specifying that we are only interested in matches at or after the position specified by that parameter.

In [None]:
first_appearance = word.find("on")
second_appearance = word.find("on", first_appearance+1)
print("The second time that we see the string on is at position", second_appearance)
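
(advanced) Repeating this step in a loop collects every appearance. This is only one possible sketch; the names `needle` and `positions` are ours, and the loop stops when `find` returns -1.

```python
word = "Python is the word. And on and on and on and on..."
needle = "on"

positions = []
position = word.find(needle)
while position != -1:
    positions.append(position)
    # Continue searching just after the start of the current match
    position = word.find(needle, position + 1)

print(positions)  # [4, 24, 31, 38, 45]
```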

#### Exercise

Consider the string `billgates@microsoft.com`. Write code that finds the username of the email address and the domain of the email address. You will need to use the `.find()` command, and also use your knowledge of indexing and slicing for this exercise. Hint: You will need to search for the `@` character using `find`, and then use the result to get the parts of the string before and after the `@` character. (Do not worry if this seems tedious. This is mainly for practice; later on, we will see how to do this in an easier way.)


In [None]:
# your code here
email = "billgates@microsoft.com"

####  `count` function

+ `str_1.count(str_2)`: counts the number of non-overlapping occurrences of `str_2` in `str_1`.

In [1]:
word = "Python is the word. And on and on and on and on..."
lookfor = "on"
count = word.count(lookfor)
print( "We see the string '", lookfor  ,"' that many times: ",  count)

We see the string ' on ' that many times:  5


In [2]:
word = "Python is the word. And on and on and on and on..."
lookfor = "Python"
count = word.count(lookfor)
print( "We see the string '", lookfor  ,"' that many times: ",  count)

We see the string ' Python ' that many times:  1


Of course, notice that if capitalization is different, the matches will not "count".

In [3]:
word = "Python is the word. And on and on and on and on..."
lookfor = "PYTHON"
count = word.count(lookfor)
print( "We see the string '", lookfor  ,"' that many times: ",  count)

We see the string ' PYTHON ' that many times:  0
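
Two more details are worth keeping in mind: `count` looks for substrings rather than whole words, and it counts matches that do not overlap. The strings below are made up for illustration.

```python
# Matches do not overlap: "aa" is counted at positions 0 and 2 only
overlap_count = "aaaa".count("aa")
print(overlap_count)  # 2

# Substrings count too: "colder" contains "cold"
substring_count = "colder and cold".count("cold")
print(substring_count)  # 2
```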


#### Exercise

Convert the code above so that it works in a case-insensitive manner. Use the `lower()` or `upper()` command.

In [None]:
# Your code here.

#### Exercise

Consider the news article from [Eater](https://ny.eater.com/2021/2/2/22255098/dining-outdoor-winter-challenges-critic-robert-sietsema), which is given below, and stored in the string variable `article`.

* Count how many times the word `cold` appears in the article.
* Count how many times the word `warm` appears in the article.
* Now sum up the occurrences of `cold` and `warm` and display the percentage of coverage for each of the two strings. (For example, if warm appears 2 times and cold 3 times, then warm is 40% and cold is 60%.)

In [4]:
article = """
A friend of mine claims she loves eating outside in the winter, maybe even more
 than being in a heated dining room. She brings a shawl with her, but sits on
 it rather than wraps it around her shoulders, a good reminder that heat can
 escape from any surface of the body. “It’s like eating in a ski resort,”
 she tells me, which is a jolly thought.
As a restaurant critic, I’m eating outdoors a lot these days.
 While most people in the city may brave the occasional sub-freezing streetside
 dinner, I’m out a few nights a week at least. And because I’m dining out
 cautiously — avoiding overbuilt, shed-like setups that don’t allow for airflow
 — I’m resigned to being exposed to the elements to avoid being
 exposed to the virus.
While an alfresco dinner in summertime seems like the most pleasant thing in
 the world, eating outdoors in the New York winter is quite a different kettle
 of fish. Icy weather introduces a host of new quirks to the dining experience,
 leaving us with numb fingers, baked faces, cold food that ought to be warm,
 and grease spots on our coats. Nevertheless, there’s a real joy to be found
 in persevering and enjoying good food in spite of adversity.
With coronavirus-era dining, our cherished habits have been upended.
 Like many in the city, I formerly ate at 8 p.m. or so, and my colleague
 Ryan Sutton used to eat much later in the days when many restaurants were open
 till midnight. Nowadays the rules state that eateries must close by 10 p.m.
 — and that means really close, so you can’t sashay in at, say, 9:45 p.m. and
 expect to be fed, as I did recently with two pals at a Portuguese restaurant
 in the West Village. They served us a cocktail, but no food since
 the kitchen was already shut.
Basking beneath multiple electric heat lamps, we talked about the superiority
 of electricity over propane gas, with the former providing a more even and 
 predictable heat. Even the electric ones are a mixed blessing. That’s because 
 they only broil one side of you, leaving the other side still cold. If you 
 could turn yourself like a rotisserie, you might be happy, but whether 
 overhead or in front, heat lamps can nearly blister a bald head or fry 
 a face in the course of a meal.
By leaving your coat, hat, and gloves on, and intermittently applying your
 mask, you can partly insulate yourself from this omnidirectional heat,
 but will gradually begin to swelter on one side, even when it’s freezing 
 outside. And every spot with a liquor license, we decided, ought to offer 
 warm cocktails in the cold winter months, because there’s nothing worse than 
 sitting down on a frigid evening and slurping a drink on the rocks, 
 which makes you as cold inside as out.
These radiant heat sources may be more effective in an enclosed space, 
which brings up another topic. In their quest to make outdoor winter dining 
more comfortable, many restaurants have overbuilt their street enclosures, 
often making them look like garden sheds. Impromptu versions of this dodgy 
arrangement enclose all four sides with plastic and other temporary materials.
Current regulations state that outdoor spaces may only be penned in on two 
 sides, though I’ve seen three-sided enclosures look safe as long as maximum 
 air movement is guaranteed — and they are technically allowed at 25 percent 
 capacity. But many of today’s outdoor spaces are way too sealed up, 
 and the tiny plastic partitions between groups are a laughable defense against 
 floating and flowing viruses. Remember, catching COVID involves extended 
 exposure to someone who has the virus in an enclosed space, 
 and a semi-enclosed space can be nearly as bad.
This fear forces me to arrive at a restaurant and eyeball the exterior dining 
 space to see if enough air is circulating, automatically rejecting those places
 enclosed on four or occasionally even three sides. But you often don’t know 
 the safety situation till you get there, forcing me to sometimes reject a 
 space and carry the meal out. My friends and I must then find a park bench to 
 eat on, or sit in a car with windows rolled down and the air system blasting. 
 No meal is so good it’s worth getting sick for.
"""


In [5]:
# Calculate the number of times that "cold" appears in the text
word = article
lookfor = "cold"
count = word.count(lookfor)
print( "We see the string '", lookfor  ,"' that many times: ",  count)

We see the string ' cold ' that many times:  4


In [None]:
# Calculate the number of times that warm appears in the text


In [1]:
# Compute the percentage for cold (vs total cold+warm)
perc_cold = YOUR_SOLUTION_HERE


In [None]:
# Compute the percentage for warm (vs total cold+warm)
perc_warm = YOUR_SOLUTION_HERE

In [None]:
# All together
print("cold",perc_cold,"%")
print("warm",perc_warm,"%")

#### `startswith` and `endswith` functions

Finally, we can also check whether a particular string starts or ends with another substring:

+ `haystack.startswith(needle)`: does the haystack string start with the needle string?
+ `haystack.endswith(needle)`: does the haystack string end with the needle string?


In [6]:
name = "New York University"
prefix = "New York"
print("Does " + name + " start with " + prefix + "?")
print(name.startswith(prefix))

Does New York University start with New York?
True


In [7]:
name = "New York University"
prefix = "University"
print("Does " + name + " start with " + prefix + "?")
print(name.startswith(prefix))

Does New York University start with University?
False


In [8]:
name = "New York University"
suffix = "University"
print("Does " + name + " end with " + suffix + "?")
print(name.endswith(suffix))

Does New York University end with University?
True
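
Both functions also accept a tuple of candidates and return `True` if any one of them matches. A brief sketch, where the file name and the prefixes and suffixes are hypothetical:

```python
filename = "report_2021.csv"  # hypothetical file name

# True if the string ends with any of the suffixes in the tuple
has_known_suffix = filename.endswith((".csv", ".txt"))
print(has_known_suffix)  # True

# False because the string starts with neither prefix
has_known_prefix = filename.startswith(("data_", "log_"))
print(has_known_prefix)  # False
```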
