# Deciding how to validate our input

Let's figure out how to validate user input in Python. Games that take user input need to check (validate) that the input has a certain format, or contains only certain characters.

We will learn:

* How to take user input
* About the `in` operator
* About list comprehensions


Let's prompt the user and store the result in a variable called `store_output_here`.
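Here is a minimal sketch of that step, assuming the built-in `input()` function is what reads the text. The helper name `ask` and its `reader` parameter are hypothetical, added only so the example can run without a keyboard; in a notebook you would normally call `input()` directly.

```python
def ask(prompt, reader=input):
    """Show a prompt and return whatever was typed, always as a string."""
    return reader(prompt)

# In a notebook you would simply run:
#     store_output_here = ask("Type something: ")
# Here a stand-in reader supplies the text instead of a person typing it.
store_output_here = ask("Type something: ", reader=lambda prompt: "yippee8")
print(store_output_here)  # yippee8
```

Whatever the user types comes back as a string, even if it looks like a number, which is why we validate it character by character.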

## Checking the input

Let's set up a few variables and understand how to do this.

In [82]:
abcs = 'abcdefghijklmnopqrstuvwxyz'  # store an a-z string
abcs  # echo

'abcdefghijklmnopqrstuvwxyz'

Let's understand the `in` operator in Python:

In [83]:
'a' in abcs

True

In [84]:
'8' in abcs

False

In [85]:
'apple' in abcs

False

Okay, it looks like we can go through each character in the string, one at a time, and use the `character in abcs` expression. Let's look at how to step through each character using a `for` loop:

In [86]:
mine = 'yippee8'  # use this for testing
for character in mine:  # loop, where character contains 'y' the first time, 'i' the second time…
    print(character)  # output, one at a time

y
i
p
p
e
e
8


Let's combine the loop with the `in` operator:

In [87]:
for character in mine:  # loop, where character contains 'y' the first time, 'i' the second time…
    print(character in abcs)  # output the boolean 'True' or 'False'

True
True
True
True
True
True
False


Let's combine the `for` loop with an `if` statement and store our result as a boolean in `bool_alpha`:

In [88]:
bool_alpha = True  # set True as the default value
for character in mine:  # loop
    if character not in abcs:  # check: notice the "not in"
        bool_alpha = False  # set to False!
bool_alpha  # output

False

Okay, now that we have a good algorithm, let's define a function so that we can re-use this bit of code over and over again:

In [89]:
def is_alpha_1(mine):
    ret = True  # default
    for character in mine:  # loop
        if character not in abcs:  # check
            ret = False  # set to False!
    return ret  # pass back

Test it:

In [90]:
is_alpha_1('allalphas')  # -> True

True

In [91]:
is_alpha_1('onenoalpha8')  # -> False

False

In [92]:
is_alpha_1('99999999999')  # -> False

False

Thinking like a computer scientist, we realize that in the last example the computer is doing more work than is necessary. The very first character in the string `99999999999` is not an alpha, yet the function keeps checking the remaining ones. We already know that the result should be `False`, so why bother continuing? Let's learn how to stop as soon as we find a non-alpha character, by **short-circuiting**:

In [93]:
def is_alpha_2(mine):
    for character in mine:  # loop
        if character not in abcs:  # check
            return False  # we're done!
    return True  # default

Notice:
* We use direct `return` statements instead of a `ret` variable
* We use the `return` keyword twice
* The default value appears at the end, rather than at the beginning
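To see how much work short-circuiting saves, the sketch below counts how many characters each approach inspects. The `count_checks_full` and `count_checks_short` helpers are hypothetical names used only for this illustration; they mirror the logic of `is_alpha_1` and `is_alpha_2`.

```python
abcs = 'abcdefghijklmnopqrstuvwxyz'  # same a-z string as before

def count_checks_full(mine):
    """Mirror is_alpha_1: inspect every character, return (result, checks)."""
    ret = True
    checks = 0
    for character in mine:
        checks += 1  # one membership test per character
        if character not in abcs:
            ret = False  # remember the failure, but keep looping
    return ret, checks

def count_checks_short(mine):
    """Mirror is_alpha_2: stop at the first non-alpha character."""
    checks = 0
    for character in mine:
        checks += 1
        if character not in abcs:
            return False, checks  # short-circuit: we're done!
    return True, checks

test_string = '9' * 11  # eleven non-alpha characters
print(count_checks_full(test_string))   # (False, 11)
print(count_checks_short(test_string))  # (False, 1)
```

Both versions give the same answer, but the short-circuiting one stops after a single check on this input.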

Cool! So we have written a function that we can use. However, we are not done. We forgot to think about capital letters:

In [95]:
is_alpha_2('Woops')  # -> False (HUH? We are expecting True!)

False

Can you fix it yourself? We can easily modify the `is_alpha_2` function by using the string method `str.lower()`. This is how to use it:

In [32]:
'A'.lower()  # convert to 'a'

'a'

In [96]:
if 'A'.lower() in abcs:
    print('Yes!')

Yes!


## Modify this:

For the challenges below, change the code here, and then re-run the boxes:

In [97]:
def is_alpha_3(mine):
    abcs = 'abcdefghijklmnopqrstuvwxyz'  # add this here
    for character in mine:  # loop
        if character not in abcs:  # check
            return False  # we're done!
    return True  # default

## Challenge:

In [98]:
is_alpha_3('aaaaa')  # -> True

True

In [99]:
is_alpha_3('AAAAA')  # -> True (did you correct it above?)

False

In [100]:
is_alpha_3('Nope8')  # -> False

False

In [101]:
is_alpha_3('this should also be FALSE!')  # -> False

False

In [102]:
is_alpha_3('What is wrong with this one')  # Expecting True, but False!

False

Whoops. We didn't think about spaces! How can we fix that bug?

## Answer key:

In [104]:
def is_alpha(mine):
    mine = mine.lower()  # convert to lowercase
    abcs = 'abcdefghijklmnopqrstuvwxyz '  # a-z plus a space at the end
    for character in mine:  # loop
        if character not in abcs:  # check
            return False  # we're done!
    return True  # default

### Playground

In [113]:
yours = input("Output string: ")  # use input built-in function to allow you to type
yours  # echo

Output string: Now isn't that special?


"Now isn't that special?"

In [114]:
is_alpha(yours)

False

## Answer Discussion:

In [155]:
def is_alpha_less_efficient(mine):
    abcs = 'abcdefghijklmnopqrstuvwxyz '  # include space in the string
    for character in mine:  # loop
        if character.lower() not in abcs:  # call lower() on one character at a time
            return False  # we're done!
    return True  # default

This function converts to lowercase one character at a time. While this works fine, our version is better because it is clearer what is happening. By converting the whole string to lowercase immediately at the top, it is easier to read.
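As a quick sanity check, the sketch below repeats both functions and runs them side by side on a handful of the test strings used earlier. The `samples` list is just an illustrative selection, not an exhaustive test.

```python
def is_alpha(mine):
    mine = mine.lower()  # convert the whole string once, up front
    abcs = 'abcdefghijklmnopqrstuvwxyz '
    for character in mine:
        if character not in abcs:
            return False
    return True

def is_alpha_less_efficient(mine):
    abcs = 'abcdefghijklmnopqrstuvwxyz '
    for character in mine:
        if character.lower() not in abcs:  # convert one character at a time
            return False
    return True

samples = ['aaaaa', 'AAAAA', 'Nope8', 'What is wrong with this one',
           "Now isn't that special?"]
for sample in samples:
    # print the string followed by the answer from each version
    print(sample, is_alpha(sample), is_alpha_less_efficient(sample))
```

On these samples the two versions agree every time, so the choice between them comes down to readability.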

## Advanced

You will use more advanced tools to accomplish the same thing, including:

- Lists
- List comprehensions
- The `all()` built-in function

A list is like a container for a group or collection of items held together.

A string behaves a lot like a list of characters! Lists can have any number of **elements**. Elements are separated by commas, and extra whitespace is used here to exaggerate the syntax:

In [148]:
[    'a' in abcs  ,  'b' in abcs  ,  'c' in abcs    ]  # list of 3 Boolean elements, written out by hand

[True, True, True]

Using a similar syntax, we can also introduce the idea of a **list comprehension**, which is a fancy way to make a list. It uses a loop to create a list of any number of elements. The grammar looks like this, again using whitespace to emphasize the punctuation:

In [137]:
[    character   for character in 'yes'    ]  # 3 elements, because there are 3 characters

['y', 'e', 's']

In your head, you can think of the grammar something like:

In [119]:
[element for element in 'elements']  # 8 elements

['e', 'l', 'e', 'm', 'e', 'n', 't', 's']
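The part before `for` does not have to be the bare element; it can be any expression that uses it. As a small sketch, here is the earlier `for` loop with the `in` operator rewritten as a list comprehension, reusing the `abcs` and `mine` values from before:

```python
abcs = 'abcdefghijklmnopqrstuvwxyz'  # same a-z string as before
mine = 'yippee8'                     # same test string as before

# one Boolean per character: is that character in abcs?
checks = [character in abcs for character in mine]
print(checks)  # [True, True, True, True, True, True, False]
```

This produces the same seven answers as the loop that printed them one at a time, only collected into a list.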

The built-in `all()` function takes a list (or any other iterable) and returns `True` if every element is true, and `False` otherwise:

In [121]:
true_list = [True, True, True]  # 3 elements
all(true_list)  # -> True

True
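A few more cases are worth trying, sketched here with throwaway lists: a single `False` is enough to make the result `False`, and an empty list gives `True`, because there is no element that fails the test.

```python
print(all([True, True, True]))   # True
print(all([True, False, True]))  # False: one False spoils it
print(all([]))                   # True: nothing in the list is False
```

The empty-list case matters for validation: an empty string has no characters to reject, so a check built on `all()` accepts it.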

That means that a function `is_alpha` using the `all()` built-in can be written thus:

In [156]:
def is_alpha_with_all(mine):
    list_of_booleans = [character.lower() in abcs for character in mine]
    print("List of bools: " + str(list_of_booleans))  # output below
    return all(list_of_booleans)  # uses all() built-in

In [157]:
is_alpha_with_all("Testing")  # -> True

List of bools: [True, True, True, True, True, True, True]


True
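One variation, not part of the version above, is to drop the square brackets so that `all()` receives a generator expression instead of a finished list. `all()` stops at the first false value it sees, so this form short-circuits much like `is_alpha_2` did. The function name `is_alpha_with_all_lazy` is hypothetical.

```python
abcs = 'abcdefghijklmnopqrstuvwxyz'  # same a-z string as before (no space)

def is_alpha_with_all_lazy(mine):
    # no square brackets: the checks are produced one at a time,
    # and all() stops as soon as one of them is False
    return all(character.lower() in abcs for character in mine)

print(is_alpha_with_all_lazy("Testing"))  # True
print(is_alpha_with_all_lazy("Nope8"))    # False
```

The trade-off is that there is no intermediate list left to print, which the original version used to show its work.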

## More Advanced

There is another way to do this as well, using a concept known as a **regular expression**.

In [22]:
import re  # "re" stands for _r_egular _e_xpression

This `import` statement allows us to use the `re` module from now on. Without it, statements like this one would fail:

In [140]:
re.sub('[a-z]', '9', 'ALL lowercase CHANGE TO 9s')  # -> ALL 999999999 CHANGE TO 99

'ALL 999999999 CHANGE TO 99'

The string `'[a-z]'` is what does the magic here. It lets us go through each character (without writing a loop!) and match it against `[a-z]`, which means "any one letter from a to z", much like the checks we did earlier. Since we are using the `re.sub` function, the next string `'9'` replaces every match with `9`. The result is that every lowercase letter is replaced with a `9`.

The challenging thing about using the `re` module is that it is actually a lot like using another programming language. In fact, the `[` and `]` have a special meaning, as shown above, and so do `^` (start of the string), `$` (end of the string) and `+` (one or more of the preceding item), which we use here:

In [164]:
def is_alpha_with_re(mine):
    ret = re.match('^[a-zA-Z ]+$', mine)  # notice added "A-Z" and a space after "a-z"
    print("Result from re.match:  " + str(ret))  # variable "ret" at this point is neither True nor False
    return bool(ret)  # convert to True or False
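Before testing the function, it may help to see `^`, `$` and `+` on their own. This is a small sketch with made-up sample strings; it uses `re.search` and `re.findall`, two other functions from the same `re` module.

```python
import re

print(bool(re.search('[a-z]', 'ABC def')))   # True: there is a lowercase letter somewhere
print(bool(re.search('^[a-z]', 'ABC def')))  # False: the string does not start with one
print(bool(re.search('[a-z]$', 'ABC def')))  # True: the string ends with one
print(re.findall('[a-z]+', 'ab CD efg'))     # ['ab', 'efg']: + groups runs of letters together
```

Put together, `^[a-zA-Z ]+$` reads as "from the start to the end, one or more letters or spaces, and nothing else".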

In [165]:
is_alpha_with_re('This string does not have very many words')  # -> True

Result from re.match:  <_sre.SRE_Match object; span=(0, 41), match='This string does not have very many words'>


True

In [166]:
is_alpha_with_re('This string has more words with at least one non-alpha character.')  # -> False

Result from re.match:  None


False
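One subtlety: `$` also matches just before a newline at the very end of the string, so the pattern above accepts input that ends with `'\n'`. A possible alternative, sketched here as an assumption rather than as part of the original solution, is `re.fullmatch`, which requires the whole string to match and needs no `^` or `$`. The function name `is_alpha_with_fullmatch` is hypothetical.

```python
import re

def is_alpha_with_fullmatch(mine):
    # fullmatch succeeds only if the pattern covers the entire string
    return bool(re.fullmatch('[a-zA-Z ]+', mine))

print(is_alpha_with_fullmatch('Only letters and spaces'))  # True
print(is_alpha_with_fullmatch('letters\n'))                # False: the newline is rejected
print(bool(re.match('^[a-zA-Z ]+$', 'letters\n')))         # True: $ matches before the final newline
```

Because of the `+`, both regular expression versions reject an empty string, whereas the loop-based `is_alpha` accepts it.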