# Worksheet 1.0.0: Strings

<div class="alert alert-block alert-info">
    This worksheet uses to-do markers wherever work needs to be completed. In some cases, this means that you'll need to add a line or two to an example. In other cases (such as the final exercise), you may need to solve an entire problem.
</div>

## `string`ing you along

To this point in the semester, we've worked with `string` objects largely through `print` statements. While we will still do some of that, this week, we're exploring the more advanced world of what lies beneath the surface.

### What is a string?

The "textbook" definition of the `string` is that it's simply a collection of _symbols_. These symbols can be anything that is computer-recognizable. Typically this means letters (`qweituortrgdafgdg`), symbols (`##@*!#!)` -- again, I'm not _that_ upset about it), or numbers-as-symbols. This last one is a little strange. Just remember that:

$$ 4 \neq{"4"} $$

Of the two values above, `4` is the _integer_ representation -- a number. The other, `"4"`, is the `string` representation of the symbol `4`. This is why we get the lovely `TypeError` when we attempt to do the following:

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-ebbc32cdfe85> in <module>
----> 1 s = 4  + "4"

TypeError: unsupported operand type(s) for +: 'int' and 'str'
```
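
If the goal is arithmetic or concatenation, one way around the error is to convert one of the values so both share a type. A minimal sketch, using the built-in `int` and `str` functions (the variable names `total` and `glued` are made up for illustration):

```python
# Convert the string to an integer to do arithmetic
total = 4 + int("4")
print(total)   # 8

# Convert the integer to a string to glue two symbols together
glued = str(4) + "4"
print(glued)   # 44
```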

Back to the definition: we can then say that a `string` is simply a group of symbols known as _characters_ which can display nearly anything that language can provide.

But I can also tell you that your computer doesn't actually care about the symbol. That's just a human convenience. All a system really cares about is that _each symbol is stored at a different place_ in a gigantic set of characters called the Unicode Coded Character Set (or just "Unicode"). Most English-language characters are stored in a _subset_ (that is, a smaller portion) of this standard that we call [ASCII (the American Standard Code for Information Interchange)](http://www.asciitable.com/).

This table, known as the "ASCII Table," pairs each symbol with a decimal (or `DEC`) number that identifies it. Here's what that looks like from a programming perspective:

In [None]:
# Notice that they're different

print("The ASCII character code for 'a':",ord("a"))
print("The ASCII character code for 'A':",ord("A"))

Here, we use a built-in function called `ord` to get the _ordinal_ of a given character (that is, its Unicode "code point," or address in the table). Again, because the _symbol_ `A` looks and behaves differently from the symbol `a`, their addresses in the table are _different_. Try it for yourself below.

#### 1a. `print` the ordinal values of at least `4` different characters or symbols.

In [None]:
# TODO

If we have _ordinal_ values (numbers), we can turn them back into characters using `ord`'s opposite, `chr`, with the same syntax as we would use for `ord`.
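
Since `chr` undoes `ord`, passing a character through both should hand back the original. A quick sketch using values that aren't part of the exercises (the variable names are only for illustration):

```python
code = ord("Z")
print(code)          # 90

letter = chr(code)
print(letter)        # Z

# Ordinals are plain integers, so we can do math with them
next_letter = chr(ord("a") + 1)
print(next_letter)   # b
```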

#### 1b. `print` the character values of the following `5` ordinal values:

* `71`
* `46`
* `87`
* `105`
* `122`

In [None]:
# TODO

### Full disclosure

I am required by the Ethical Code of All Computer Science Teachers<sup>TM</sup> to make you aware of the fact that I've been telling you a convenient lie all semester.

<img src = "https://i.imgflip.com/4i1d15.jpg">

This is because `string` objects, while appearing like any other _data type_ (`float`, `int`, `boolean`, et al.) really _aren't_. We can treat `string`s like a _data type_ **and** a _data structure_. 

Let's take a look at an example.

In [None]:
cat_name = "Ulysses"

for letter in cat_name:
    print(letter)

Notice that _each `character`_ `print`s on a separate line. This is because `string`s function a lot like a _data structure_ -- specifically `list`s. 

### Similarities to `list`s

* a _known length_
* access via _indexes_
* the ability to _slice_ them
* the ability to _iterate_ over them

In [None]:
# known length
print("cat_name length:", len(cat_name))

# access via index
print("cat_name index 4:", cat_name[4])

# ability to slice
print("cat_name sliced:", cat_name[2:5])

print("Iteration:")
for letter in cat_name:
    print(letter)

### Differences from `list`s

* `string`s are _immutable_: we can't change them _directly_
* different _methods_ than a `list`
* `string`s only accept `string` data (we can't mix data types)
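
The first difference is the easiest to see in code. In this sketch (the names `cat_letters` and `message` are invented for the example, and `try`/`except` is there only so the code runs to the end), changing one item works for a `list` but raises a `TypeError` for a `string`:

```python
cat_name = "Ulysses"
cat_letters = list(cat_name)   # a list of single-character strings

cat_letters[0] = "O"           # lists are mutable, so this works
print(cat_letters)

try:
    cat_name[0] = "O"          # strings are immutable, so this fails
except TypeError as error:
    message = str(error)
    print("TypeError:", message)

print(cat_name)                # still "Ulysses"
```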

#### Methods

We've learned a bit about _methods_ in this course, though we haven't spent a lot of time with them. Specifically, we know things like `append` and `remove` from our use of them with `list`s. Regardless of the object type we're working with, the syntax is always the same:

$$ variable\_name.method\_name(arguments) $$

As concerns `string`s, we have quite a few things we can do with them (this is a _small sampling_):

| Method | Argument(s) |Effect | Example |
|--------|-------------|-------|---------|
|`.lower()`|None | Converts entire string to lower case | `a_string.lower()` |
|`.upper()`|None |Converts entire string to upper case | `a_string.upper()` |
|`.count()`|`string` to count instances of|Counts the number of times a given substring appears in a `string`| `a_string.count("a")` |
|`.endswith()`|`string`/`tuple` of `string`s to look for | Returns `True` if the `string` ends with the argument (or any item in the `tuple`), `False` otherwise | `a_string.endswith("ing")`|
|`.startswith()`|`string`/`tuple` of `string`s to look for| Returns `True` if the `string` starts with the argument (or any item in the `tuple`), `False` otherwise | `a_string.startswith("re")`|
|`.replace()`|`string` to find, `string` to replace it with, optional `integer` number of times to replace| Replaces the searched `string` with the specified replacement, up to `N` times (default: every occurrence)| `a_string.replace("Ulysses","Boss")` |
|`.split()`|`string` on which to "split" the `string` (default: whitespace) |Splits a `string` into parts (a `list`)| `a_string.split(",")`|
|`.join()`|`list` to "glue" together into a `string`|Fuses a `list` of `string`s together into one `string`| `",".join(a_list)`|
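
To see a few of these in action, here is a short sketch on a made-up `string` (`snack` isn't used anywhere else in this worksheet). The comments show what each call should return:

```python
snack = "Banana bread"

lowered = snack.lower()                    # "banana bread"
shouted = snack.upper()                    # "BANANA BREAD"
a_count = snack.count("a")                 # 4
starts = snack.startswith("Ban")           # True
ends = snack.endswith(("cake", "bread"))   # True: matches one item in the tuple
swapped = snack.replace("a", "o", 2)       # "Bonona bread": only the first 2 are replaced
parts = snack.split()                      # ["Banana", "bread"]

print(lowered, shouted, a_count, starts, ends, swapped, parts)
```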

In the above table, `join()` behaves a bit differently than the others. See the following:

In [None]:
sentences = ["It was the best of times","it was the worst of times."]
", ".join(sentences)

The "glue" (the `string` that joins the items of the `list` together) comes _first_, and the `list` is passed as the argument.
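
Because `join` only glues `string`s, a `list` containing numbers has to be converted first. A minimal sketch, assuming a made-up `list` called `scores` and the built-in `str` function:

```python
scores = [3, 1, 4]

# ", ".join(scores) would raise a TypeError: the items are integers

score_words = []
for score in scores:
    score_words.append(str(score))   # convert each number to a string

result = ", ".join(score_words)      # the glue comes first
print(result)                        # 3, 1, 4
```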

#### Detour into `string` immutability

We say a `string` is "immutable," though some of the methods above certainly _appear_ to alter the contents of a `string`. Let's observe this hands-on.

In [None]:
introduction = "My cat's name is Ulysses."
print(introduction.replace("Ulysses","The Boss"))
print(introduction)

As we can see in the above example, calling the `replace` method on the `string` _doesn't change the underlying `string`_ (as shown when we `print` it again at the end). Instead, it creates a new copy of the `string` which contains the replacement. This is what we mean when we say that `string`s are _immutable_ -- unless we reassign the variable, the original data doesn't change.
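
To keep the changed version, assign the method's result to a variable, either a new one or the original name. A small sketch with a made-up variable:

```python
greeting = "hello, ulysses"

greeting.upper()              # the result is thrown away; greeting is unchanged
print(greeting)               # hello, ulysses

greeting = greeting.upper()   # reassign the name to the new string
print(greeting)               # HELLO, ULYSSES
```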

#### 2. Complete the following examples.

In [None]:
# Example 1

while True:
    choice = input("Tell me something (enter [N]o to quit): ")
    if # TODO use the upper method on choice to make either "n" or "N" valid choices to end the loop
        print("Stop telling me stuff!")
        break

In [None]:
# Example 2

animals = ["cat","rat","antelope","bat","anteater","ant","abalone","python","Ulysses"]
for animal in animals:
    # TODO Finish the following statement to print `True` if an animal's name starts with the letter "a"
    print(animal + ":",# TODO)

In [None]:
# Example 3

# This is a lie
bad_message = "My cat Ulysses is the worst."
# TODO Use the replace method to replace the word "worst" with the word "best";
#      assign the output to good_message
print(good_message)

In [None]:
# Example 4

quote = "Colorless green ideas sleep furiously."


words = # TODO: Split quote on spaces
print(words)

joined = # TODO: join words together using spaces
print(joined)

In [None]:
# Example 5

verse = "Betty bought some butter, but the butter was bitter, so Betty bought some better butter to make the bitter butter better."

words = []

# TODO Count the number of words that start with the letter b, display those words in a list

print(words)

### "Control" characters

While we've been thinking about `string`s as visible characters -- letters, numbers, symbols, et al. -- they can also contain _nonprinting_ characters that perform various formatting tasks. Among the most useful:

| Control character | Purpose |
|-------------------|---------|
| `\n`              | Insert a new line |
| `\t`              | Insert a tab |

It's easier to see how these work by looking at examples.

In [None]:
# New line

print("This is one sentence.\nAnd this is another!")

Notice that these are written in the `print` function as _one_ line, but actually print as _two_! Of course, we can chain any number of these together:

In [None]:
print("It all looks like one line.\nBut, we know it's not!\nWhen we run the code, it's simple to see.")

Knowing this will come in extra handy in our next worksheet! For now, let's look at the other one we've picked out: `\t`.

In [None]:
# Tab

print("Name:\tG. Wiz")
print("Type:\tGator")
print("Job:\tWizard")
print("Description:\tA gator who is also a wizard whose name is G. Wiz.")

#### Use nonprinting characters `\n` and `\t` to print the following examples.


##### 3a. `print` the following haiku using only `1` `print` statement

(G. Wiz thinks this is their best work yet.)

```
Yep. I'm a gator.
You think that's strange? Well, also
I practice magic.
```

In [None]:
# TODO

##### 3b. `print` the following using `\t` _and_ `\n` characters.

```
1    2    3
4    5    6
7    8    9
```

(If you're up to the challenge, can you do this with a loop? You will need to use the `%` operator, which gives the remainder of a division; a remainder of `0` means the value is evenly divisible.)
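
As a refresher on `%` by itself (this sketch doesn't print the grid, and the names `remainders` and `remainder` are made up), the operator returns what's left over after division, so a result of `0` means the number divides evenly:

```python
remainders = []

for number in range(1, 7):
    remainder = number % 3
    remainders.append(remainder)
    if remainder == 0:
        print(number, "is divisible by 3")
    else:
        print(number, "divided by 3 leaves", remainder)

print(remainders)   # [1, 2, 0, 1, 2, 0]
```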

In [None]:
# TODO

### `f`ormatting your `string`s

Gone are the days of using `+` and `,` in your print statements! As you may have noticed at the end of last week's lab, there's a strange way of printing _formatted strings_ (or, `f-string`s).

Take a look at the difference:

In [None]:
# I'm apparently hungry

loaves = 6
kind = "rye"

print("I have",loaves,"loaves of " + kind + " bread.")

In [None]:
# That's too complicated; plus, I found some more bread.
loaves += 4
#
#  Prepend the string with an f
#     |
#     |     Include variables surrounded by {}
#     |           |                  |
print(f"I have {loaves} loaves of {kind} bread.")

As long as we follow the rules that we already know about creating variables, we can do all kinds of calculations and insert their results into `string`s! We can even insert nonprinting characters!
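
For the calculation part, a minimal sketch: anything inside the `{}` is evaluated before it's placed in the `string`. The variables `price` and `receipt` are made up for this example:

```python
loaves = 10
price = 3.5   # hypothetical price per loaf

# The multiplication happens inside the braces
receipt = f"{loaves} loaves at ${price} each cost ${loaves * price}."
print(receipt)   # 10 loaves at $3.5 each cost $35.0.
```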

In [None]:
breads = {
    "pumpernickel": 10,
    "challah": 1000,
    "8 grain": 2,
    "sourdough": 0 # Nope
}

for bread in breads:
    count = breads[bread]
    print(f"{bread}:\t{count}")

#### 4. Complete the following example

`print` the contents of the following table:
    
|Bird type | # seen |
|----------|--------|
|Crow      | 10 |
|Gull      | 2  |
|Pigeon    | 1  |
|Grackle   | 0  |
|Hawk      | 1  |

* You will need to create a `dictionary` to house the table's data
* Iterate over your `dictionary` with a `for` loop
* Separate `key`s and `value`s with `\t` characters
  * Use `f-string`s for this!

The final output should appear like this:

```
Crow	10
Gull	2
Pigeon	1
Grackle	0
Hawk	1
```

In [None]:
# TODO

### Finishing this activity

Test yourself by completing the [final activity](f1_week-1-worksheet-letters.md)!